-1

The following regex pattern (for PHP) is meant to validate any email address:

^[\w.-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}$

It says: "match one or more word characters (letters, digits or underscores), periods and/or dashes, followed by exactly one @, followed by one or more letters, digits, periods and/or dashes, followed by exactly one period, followed by two to six upper- and/or lower-case letters."

This seems to match any email address I can think of. Still, this feeling of getting it right is probably deceptive. Can someone knowledgeable please point out an obvious or not-so-obvious vulnerability in this pattern that I'm not aware of, which would make it not perform the email validation the way it's meant to?

(To foresee a possible response: I'm aware that the filter_var() function offers a more robust solution, but I'm specifically interested in the regular expression in this case.)

NOTE: this is a theoretical question about the PHP flavor of regex, NOT a practical question about validating emails. I merely want to determine the limits of what is reasonably possible with regex in this case.

Thank you in advance!

Dimitri Vorontzov
  • 1
    Email may contain plus (`+`) symbols, for example. You already saw [RFC-compliant regex](http://ex-parrot.com/~pdw/Mail-RFC822-Address.html), didn't you? Also see [this question](http://stackoverflow.com/questions/201323/using-a-regular-expression-to-validate-an-email-address) – galymzhan Feb 14 '13 at 03:55
  • I did. I am asking the question because I want to see an example of a practical email address that would not be matched by this pattern. – Dimitri Vorontzov Feb 14 '13 at 04:01

2 Answers

1

Using regular expressions to validate emails is tricky.

Try the following address as input to your regex (^[\w.-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}$). It is invalid because of the consecutive dots in the domain, yet the pattern accepts it:

abc@b...com
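
To confirm this in PHP, here is a minimal sketch. The helper name `matchesPattern` is made up for this illustration; the pattern is the one from the question, wrapped in `/` delimiters for `preg_match()`.

```php
<?php
// The pattern from the question, wrapped in PCRE delimiters.
$pattern = '/^[\w.-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}$/';

// Hypothetical helper, named only for this example.
function matchesPattern(string $pattern, string $address): bool
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $address) === 1;
}

// Consecutive dots in the domain are accepted, because the domain
// character class [A-Za-z0-9.-] itself allows any number of periods.
var_dump(matchesPattern($pattern, 'abc@b...com'));     // bool(true)

// For the same reason, a multi-label domain is accepted as well.
var_dump(matchesPattern($pattern, 'me@ofsol.gov.uk')); // bool(true)
```

So the "one and only one period" reading only applies to the period before the final two to six letters; the part before it can contain as many periods as it likes.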

You can read more about email regex validation at http://www.regular-expressions.info/email.html

If you are doing this for an app, validate the address by sending an email to it rather than relying on a very complex regex.

Josnidhin
1

The email address specification is pretty nuts. There are regexen out there that can do a full validation for it, but they are thousands of characters long. It may be better to parse it on your own, but PHP has a built-in validator for email addresses:

filter_var($email, FILTER_VALIDATE_EMAIL);
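
As a rough usage sketch (the sample addresses are made up for illustration): `filter_var()` returns the filtered string on success and `false` on failure, so the result is best compared strictly against `false`.

```php
<?php
// Sample inputs, chosen only for this illustration.
$good = filter_var('user@example.com', FILTER_VALIDATE_EMAIL);
$bad  = filter_var('not-an-email', FILTER_VALIDATE_EMAIL);

var_dump($good); // string(16) "user@example.com"
var_dump($bad);  // bool(false)

// Strict comparison avoids confusing a failed validation with a falsy value.
if ($good !== false) {
    echo "valid\n";
}
```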

EDIT:

In answer to your specific question about an email address that will fail: any address whose local part (the part before the @) is in quotes will fail, because your pattern doesn't account for quotes at all:

"explosion-pills"@aysites.com
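
A small sketch to check this against the pattern from the question. The helper name `matchesPattern` and the address `user+tag@example.com` are made up for this illustration; the plus sign case is the one raised in the comments on the question.

```php
<?php
// The pattern from the question, wrapped in PCRE delimiters.
$pattern = '/^[\w.-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}$/';

// Hypothetical helper, named only for this example.
function matchesPattern(string $pattern, string $address): bool
{
    return preg_match($pattern, $address) === 1;
}

// The double quote is not in [\w.-], so the match fails at the first character.
var_dump(matchesPattern($pattern, '"explosion-pills"@aysites.com')); // bool(false)

// The plus sign is not in [\w.-] either, so plus-addressed mail is rejected.
var_dump(matchesPattern($pattern, 'user+tag@example.com'));          // bool(false)

// The same address without the quotes is accepted.
var_dump(matchesPattern($pattern, 'explosion-pills@aysites.com'));   // bool(true)
```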
Explosion Pills
  • Thanks. I am aware of the filter_var() option. But that's not what I was asking. – Dimitri Vorontzov Feb 14 '13 at 04:02
  • @DimitriVorontzov If it is only one period as in "...or lower-case letters, and/or periods, and/or underscores followed by one and only one period followed by two to six upper- and/or lower-case letters." then would me@ofsol.gov.uk fail? – Steve Dec 05 '15 at 17:52