0

I am using PHP to find e-mail addresses in a given text. My current regex is:

'/([\w+\.]*\w+@[\w+\.]*\w+[\w+\-\w+]*\.\w+)/is'

It consumes a lot of CPU. Is there an optimized regex that uses fewer resources (i.e. CPU) for finding valid e-mail addresses in a given text?

Wiktor Stribiżew
Siddharth sharma
  • 3
    Possible duplicate of [Using a regular expression to validate an email address](http://stackoverflow.com/questions/201323/using-a-regular-expression-to-validate-an-email-address) – randers May 16 '16 at 08:14
  • 1
    Use the regex mentioned by @RAnders00, or use atomic groups or possessive quantifiers; otherwise there will be too much backtracking __(your regex is incorrect, though)__ – rock321987 May 16 '16 at 08:18
  • Thanks @RAnders00 for pointing out that link. – Siddharth sharma May 16 '16 at 08:31
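
The possessive-quantifier suggestion from the comments can be sketched in PHP as follows. The pattern is a hypothetical illustration, not one posted in this thread, and it is deliberately loose about what counts as a valid address. Each `++` tells PCRE not to give back characters it has already consumed, so a long run of word characters with no `@` after it fails without being retried at shorter lengths.

```php
<?php
// Hypothetical pattern for illustration only (not from the thread).
// "++" is a possessive quantifier: the run it consumed is never re-tried
// at a shorter length, which limits backtracking on non-matching text.
// The outer group uses a plain "+" on purpose, so that a sentence-ending
// period after the address can still be given back.
$pattern = '/[\w.+-]++@(?:[\w-]++\.)+\w++/';

$text = 'Write to foo@bar.co.uk or hello.world@example.com.';

// Collect every match in the text; $matches[0] holds the full matches.
preg_match_all($pattern, $text, $matches);

echo implode(', ', $matches[0]), PHP_EOL; // foo@bar.co.uk, hello.world@example.com
```

Making the outer group possessive as well would stop the second address from matching here, because the engine could no longer release the trailing `com.` it consumed.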

3 Answers

1

This

/^[^@]+@[a-z]+(\.[a-z]+)+$/

is better than yours.

Why?

Let's say we want to test this email: foo@bar.co.uk

When the match succeeds, my regex takes 14 steps to find the solution.
Yours takes 22 steps.

BUT THE BIGGEST DIFFERENCE IS IN THE NON-MATCHING CASE

Let's say we want to test this email: foo@bar.co.uk.foo.

My regex performs 31 steps and fails.

Yours (which should be anchored with ^ and $, otherwise it will accept this input as a match) performs 292 steps and fails!
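
A minimal PHP sketch of the two test cases above, using the anchored pattern from this answer. The helper name `isPlausibleEmail` is hypothetical and exists only for this example. Because the pattern is anchored with `^` and `$`, it checks a whole candidate string rather than searching inside a longer text.

```php
<?php
// Pattern quoted from this answer; ^ and $ make it test the entire string.
$pattern = '/^[^@]+@[a-z]+(\.[a-z]+)+$/';

// Hypothetical helper, introduced only for this sketch.
function isPlausibleEmail(string $candidate, string $pattern): bool
{
    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $candidate) === 1;
}

var_dump(isPlausibleEmail('foo@bar.co.uk', $pattern));      // bool(true)
var_dump(isPlausibleEmail('foo@bar.co.uk.foo.', $pattern)); // bool(false)
```

The second input fails because the trailing period is not followed by any letters, so `$` cannot match after the last label.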

DonCallisto
1

Sometimes trading off some false positives for better performance is desirable:

/[^ @]*@[^ ]*/

This should be quite fast. It will also match stuff like __imp__MessageBoxW@16, but such constructs aren't that common in normal text.
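
One way to live with those false positives, sketched here as an assumption rather than as part of this answer, is to scan with the cheap pattern first and then run PHP's built-in `filter_var` check only on the few candidates it finds. The sample text and variable names are made up for the example.

```php
<?php
// Cheap pattern quoted from this answer: anything around an "@" up to a space.
$pattern = '/[^ @]*@[^ ]*/';

// Made-up sample text containing one address and one false positive.
$text = 'Mail foo@bar.co.uk about __imp__MessageBoxW@16 today';

// Step 1: fast scan; $matches[0] holds every candidate.
preg_match_all($pattern, $text, $matches);
$candidates = $matches[0];

// Step 2 (an addition, not from the answer): stricter validation,
// applied only to the candidates instead of the whole text.
$emails = array_values(array_filter($candidates, function ($candidate) {
    return filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false;
}));

echo implode(', ', $candidates), PHP_EOL; // foo@bar.co.uk, __imp__MessageBoxW@16
echo implode(', ', $emails), PHP_EOL;     // foo@bar.co.uk
```

Since `[^ ]*` runs to the next space, an address followed directly by a comma or period would be captured with that punctuation attached, so real input may need trimming before the second step.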

a3f
0

Try this:

/[-\d\w\W]+@[-\d\w.+_]+.\w{2,4}/

Matches:

hello.world@example.com, my_guru@localhost, guy.31@site.co, etc.

Tested at http://regexr.com/

keziah