3

I have a longtext column "description" in my table that sometimes contains an email address. I need to extract this email address and add to a separate column for each row. Is this possible to do in MySQL?

Has QUIT--Anony-Mousse
  • 76,138
  • 12
  • 138
  • 194
David Bailey
  • 603
  • 2
  • 7
  • 19
  • Of course yes. Please explain more what you have try so far and what is your database structure. – lvarayut Jun 12 '13 at 16:32
  • Basically, I have a "description" column and another column called "email" and I need to look for the email address in the description and add to the email column. I think I will write a quick Python script to do this, but was interested if there was a way to do it directly in the database. – David Bailey Jun 12 '13 at 17:16

5 Answers5

5

Yes, you can use mysql's REGEXP (perhaps this is new to version 5 and 8 which may be after this question was posted.)

SELECT *, REGEXP_SUBSTR(`description`, '([a-zA-Z0-9._%+\-]+)@([a-zA-Z0-9.-]+)\.([a-zA-Z]{2,4})') AS Emails FROM `mytable`;
Saj
  • 781
  • 8
  • 7
4

You can use substring index to capture email addresses...

The first substring index capture the account.
The second substring_index captures the hostname. It is necessary to pick the same email address in case the are multiple atso (@) stored in the column.

select concat( substring_index(substring_index(description,'@',1),' ',-1)
             , substring_index(substring_index( description,
                                                substring_index(description,'@',1),-1),
                               ' ',1))
RMathis
  • 588
  • 2
  • 7
3

You can't select matched part only from Regular expression matching using pure Mysql. You can use mysql extension (as stated in Return matching pattern, or use a scripting language (ex. PHP).

Community
  • 1
  • 1
Yousf
  • 3,957
  • 3
  • 27
  • 37
  • @hd1 I was talking about matching (selecting the matched part only). You can't do this using pure mysql. – Yousf Jun 13 '13 at 09:29
1

MySQL does have regular expressions, but regular expressions are not the best way to match email addresses. I'd strongly recommend using your client language.

hd1
  • 33,938
  • 5
  • 80
  • 91
0

If you can install the lib_mysqludf_preg MySQL UDF, then you could do:

SET @regex = "/([a-z0-9!#\$%&'\*\+\/=\?\^_`\{\|\}~\-]+(?:\.[a-z0-9!#\$%&'\*\+\/=\?^_`{\|}~\-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+(?:[A-Z]{2}|aero|arpa|asia|biz|cat|com|coop|edu|gov|info|int|jobs|mil|mobi|museum|name|net|org|post|pro|tel|travel|xxx))/i";

SELECT 
    PREG_CAPTURE(@regex, description) 
FROM 
    example 
WHERE
    PREG_CAPTURE(@regex, description) > '';

to extract the first email address from the description field.

I can't think of another solution, as the REGEXP operator simply returns 1 or 0, and not the location of where the regular expression matched.

Ross Smith II
  • 11,799
  • 1
  • 38
  • 43