1

I am novice to regular expressions. I am trying to remove emails from a list which do not belong to a specific domain.

for e.g. I have a below list of emails:

John@yahoo.co.in , Jacob@gmail.com, Bob@rediff.com,
Lisa@abc.com, sam@gmail.com , rita@yahoo.com

I need to get only the gmail ids:

Jacob@gmail.com, sam@gmail.com

Please note we may have spaces before the comma delimiters. Appreciate any help!

Vadym Pechenoha
  • 574
  • 5
  • 17

3 Answers3

2

This could be a start for you.

SELECT *
  FROM (    SELECT REGEXP_SUBSTR (str,
                                  '[[:alnum:]\.\+]+@gmail.com',
                                  1,
                                  LEVEL)
                      AS SUBSTR
              FROM (SELECT ' John@yahoo.co.in , Jacob.foo@gmail.com,   Bob@rediff.com,Lisa@abc.com, sam@gmail.com , sam.bar+stackoverflow@gmail.com, rita@yahoo.com, foobar '
                              AS str
                      FROM DUAL)
        CONNECT BY LEVEL <= LENGTH (REGEXP_REPLACE (str, '[^,]+')) + 1)
 WHERE SUBSTR IS NOT NULL ;

Put in a few more examples, but an email checker should comply to the respective RFCs, look at wikipedia for further knowledge about them https://en.wikipedia.org/wiki/Email_address

Inspiration from https://stackoverflow.com/a/17597049/869069

XING
  • 9,608
  • 4
  • 22
  • 38
oey
  • 87
  • 6
0

Here's a method using a CTE just for a different take on the problem. First step is to make a CTE "table" that contains the parsed list elements. Then select from that. The CTE regex handles NULL list elements.

with main_tbl(email) as (
  select ' John@yahoo.co.in , Jacob.foo@gmail.com,   Bob@rediff.com,Lisa@abc.com, sam@gmail.com , sam.bar+stackoverflow@gmail.com, rita@yahoo.com, foobar '
  from dual
),
email_list(email_addr) as (
  select trim(regexp_substr(email, '(.*?)(,|$)', 1, level, NULL, 1))
  from main_tbl
  connect by level <= regexp_count(email, ',')+1
)
-- select * from email_list;
select LISTAGG(TRIM(email_addr), ', ') WITHIN GROUP ( ORDER BY email_addr )  
from email_list
where lower(email_addr) like '%gmail.com';
Gary_W
  • 9,933
  • 1
  • 22
  • 40
0

Rather than suppress the emails not matching a particular domain (in your example, gmail.com), you might try getting only those emails that match the domain:

WITH a1 AS (
    SELECT 'John@yahoo.co.in , Jacob@gmail.com, Bob@rediff.com,Lisa@abc.com, sam@gmail.com , rita@yahoo.com' AS email_list FROM dual
)
SELECT LISTAGG(TRIM(email), ',') WITHIN GROUP ( ORDER BY priority )
  FROM (
    SELECT REGEXP_SUBSTR(email_list, '[^,]+@gmail.com', 1, LEVEL, 'i') AS email
         , LEVEL AS priority
      FROM a1
   CONNECT BY LEVEL <= REGEXP_COUNT(email_list, '[^,]+@gmail.com', 1, 'i')
);

That said, Oracle is probably not the best tool for this (do you have these email addresses stored as a list in a table somewhere? If so then @GordonLinoff's comment is apt - fix your data model if you can).

David Faber
  • 12,277
  • 2
  • 29
  • 40