How to get the nth match from regexp_matches() as plain text

Question

I have this code:

with demo as (
select 'WWW.HELLO.COM' web
union all
select 'hi.co.uk' web)
select regexp_matches(replace(lower(web),'www.',''),'([^\.]*)') from demo

And the table I get is:

regexp_matches
{hello}
{hi}

What I would like to do is:

with demo as (
select 'WWW.HELLO.COM' web
union all
select 'hi.co.uk' web)
select regexp_matches(replace(lower(web),'www.',''),'([^\.]*)')[1] from demo

Or even the big query version:

with demo as (
select 'WWW.HELLO.COM' web
union all
select 'hi.co.uk' web)
select regexp_matches(replace(lower(web),'www.',''),'([^\.]*)')[offset(1)] from demo

But neither works. Is this possible? If it isn't clear, the result I would like is:

match
hello
hi

Erwin Brandstetter · Answer 1 · 2021-05-01T23:29:05.803

Use split_part() instead. Simpler, faster. To get the first word, before the first separator .:

WITH demo(web) AS (
   VALUES
      ('WWW.HELLO.COM')
    , ('hi.co.uk')
   )
SELECT split_part(replace(lower(web), 'www.', ''), '.', 1)
FROM   demo;

db<>fiddle here

See:

Split comma separated column data into additional columns

regexp_matches() returns setof text[], i.e. 0-n rows of text arrays. (Because each regular expression can result in a set of multiple matching strings.)

In Postgres 10 or later, there is also the simpler variant regexp_match() that only returns the first match, i.e. text[]. Either way, the surrounding curly braces in your result are the text representation of the array literal.

You can take the first row and unnest the first element of the array, but since you neither want the set nor the array to begin with, use split_part() instead. Simpler, faster, and less versatile. But good enough for the purpose. And it returns exactly what you want to begin with: text.

score 0 · Accepted Answer · answered May 01 '21 at 23:07

0

I'm a little confused. Doesn't this do what you want?

with demo as (
      select 'WWW.HELLO.COM' web
      union all
      select 'hi.co.uk' web
     )
select (regexp_matches(replace(lower(web), 'www.',''), '([^\.]*)'))[1]
from demo

This is basically your query with extra parentheses so it does not generate a syntax error.

Here is a db<>fiddle illustrating that it returns what you want.

answered May 01 '21 at 23:07

Gordon Linoff

1,242,037
58
646
786

This is it, of course the other solution worked also. Why the extra parenthesis? I'm a little confused – Henrietta Martingale May 02 '21 at 22:12
@HenriettaMartingale . . . I assume you mean the parentheses around `regexp_matches()`. That is so the expression can use `[]` for addressing an array element. I think of it as a quirk of Postgres syntax. – Gordon Linoff May 03 '21 at 01:07

How to get the nth match from regexp_matches() as plain text

2 Answers2