2

Having looked at the posts here on email address validation, I want to be much more liberal in the client-side test I am performing.

The closest I have seen so far is:

^([\w-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$

That will not match this#strnage@foo.com, which is valid according to the RFC. The local part may contain:

  • Uppercase and lowercase English letters (a-z, A-Z)
  • Digits 0 through 9
  • Characters ! # $ % & ' * + - / = ? ^ _ ` { | } ~
  • Character . (dot, period, full stop) provided that it is not the first or last character, and provided also that it does not appear two or more times consecutively.

I want a pretty simple match:

  • Does not start with .
  • Any character allowed up to the @
  • Any character allowed after the @
  • No consecutive . or @ allowed
  • Part after the last . (tld) must be [a-z0-9-]

I will use the /i flag to make the search case-insensitive. The consecutive characters are where I am getting hung up.
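
One way the rules listed above might be expressed is with two negative lookaheads, one for the leading dot and one for the doubled characters. This is only a sketch: the names `LOOSE_EMAIL` and `isLooseEmail` are made up for illustration, and it assumes "no consecutive . or @" means the literal sequences `..` and `@@`.

```javascript
// Sketch of the loose rules from the question (names are hypothetical).
//   (?!\.)               does not start with a dot
//   (?!.*(?:\.\.|@@))    no ".." or "@@" anywhere in the string
//   .+@.+                anything before and after an @
//   \.[a-z0-9-]+$        part after the last dot is letters, digits or hyphen
const LOOSE_EMAIL = /^(?!\.)(?!.*(?:\.\.|@@)).+@.+\.[a-z0-9-]+$/i;

function isLooseEmail(address) {
  return LOOSE_EMAIL.test(address);
}

console.log(isLooseEmail("this#strnage@foo.com")); // true
console.log(isLooseEmail(".test@example.com"));    // false (leading dot)
console.log(isLooseEmail("test@example..com"));    // false (consecutive dots)
console.log(isLooseEmail("t#est@exa#mple.c#om"));  // false (bad TLD)
```

Because "any character" is allowed after the @, this sketch accepts t#est@exa#mple.com; restricting the whole domain would need a tighter pattern.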

user170579
  • I have been working on one, looks like this is going to cover it broadly .+@(?:[-a-z0-9]+\.)+[a-z]{2,10} I do not suppose there will be a tld longer than 10 chars, .museum seems to be the current record holder in strangeness. – user170579 Sep 22 '09 at 01:51
  • http://stackoverflow.com/questions/3232/how-far-should-one-take-e-mail-address-validation/300862#300862 – some Sep 22 '09 at 02:07
  • possible duplicate of [How to validate an email address in PHP](http://stackoverflow.com/questions/12026842/how-to-validate-an-email-address-in-php) (see the regex pattern in there) – PeeHaa Aug 08 '13 at 09:35

7 Answers

4

If you want to match against the official standard, you can use

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

So even when following official standards, there are still trade-offs to be made. Don't blindly copy regular expressions from online libraries or discussion forums. Always test them on your own data and with your own applications.

Esteban Küber
1
/^[^.].*@(?:[-a-z0-9]+\.)+[-a-z0-9]+$/
Matthew Scharley
  • 1
    Seems near perfect, one exception:
    test@example.com
    .test@example.com
    test@example.example.com
    test@example..com
    t#est@example.com
    t#est@exa#mple.com <-this matches
    t#est@exa#mple.c#om
    Anything after the last @ should be only [a-z0-9-] (valid domain chars) then a dot, then another [a-z0-9-]
    – user170579 Sep 22 '09 at 01:21
  • In your question, you said only the TLD name should be [-a-z0-9]. Fixing that is trivial. – Matthew Scharley Sep 22 '09 at 01:52
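
For anyone wanting to reproduce the check in this comment thread, here is a small sketch that runs the pattern exactly as it appears in the answer above against the commenter's list. By my reading, the pattern in this form already requires every domain label to be [-a-z0-9], so the "<-this matches" remark may describe an earlier revision of the answer; that is an assumption on my part.

```javascript
// Pattern copied from the answer above; the sample list is from the comment.
const pattern = /^[^.].*@(?:[-a-z0-9]+\.)+[-a-z0-9]+$/;

const samples = [
  "test@example.com",
  ".test@example.com",
  "test@example.example.com",
  "test@example..com",
  "t#est@example.com",
  "t#est@exa#mple.com",
  "t#est@exa#mple.c#om",
];

// Pair each address with the result of the test.
const results = samples.map((address) => [address, pattern.test(address)]);

for (const [address, ok] of results) {
  console.log(ok ? "match   " : "no match", address);
}
```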
0

A very Perl-ish RFC822 compliant regular expression can be found here

Steen
0
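function validator(email) {
   // Returns true when the address is BAD and false when it passes every check.
   var bademail = false;
   bademail = (email.indexOf(".") == 0) ? true : bademail;   // leading dot
   bademail = (email.indexOf("..") != -1) ? true : bademail; // consecutive dots
   bademail = (email.indexOf("@@") != -1) ? true : bademail; // consecutive @ signs
   if(!bademail) {
      // Anchored and case-insensitive so every character of the TLD must be valid.
      var tldTest = new RegExp("^[a-z0-9-]+$", "i");
      var lastperiodpos = email.lastIndexOf(".");
      var tldstr = email.slice(lastperiodpos + 1);
      bademail = (!(tldTest.test(tldstr))) ? true : bademail;
   }
   return bademail;
}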
Martin York
Anthony
  • 1
    +1 because some boor gave you a -1 without leaving a comment. I hate that! – TrueWill Sep 22 '09 at 01:12
  • Thanks. I just figured I'd actually keep it simple, as requested, rather than involve regex where it's not needed. Wish I could have thought of a way to use it at the end that wasn't convoluted. – Anthony Sep 22 '09 at 01:35
  • -1. I don't think that ".." is generally illegal in an email address. A friend of mine once had such an email address. You should simply remove that rule. And what about special characters in the domain part? They are also allowed, but must be translated according to [RFC 3492](http://tools.ietf.org/html/rfc3492). So this is not a correct answer. – pvorb Mar 28 '12 at 00:40
  • @pvorb: You are wrong... ".." (consecutive dots) *is* generally illegal in both the local part and domain part of the address. http://en.wikipedia.org/wiki/E-mail_address#Local_part – Eric J. Apr 04 '12 at 18:03
  • 1
    @EricJ.: Thanks for the clarification. It seems that both I and my friend's email provider were wrong. – pvorb Apr 05 '12 at 09:14
0

It depends on who is using your applications. For internal applications, a bare username is often a valid email address. Much of the RFC-822 email spec describes additional fields which may be present in an email address. For example, Allen Town , is a pretty standard email address which you might type into your favorite mail client. However, for an application, you may want to be the one adding the name to the email address when you send email, and not want it to be part of the user's address.

The most liberal way of validating an email address is to just attempt to send an email to whatever address the user gives. If they receive the email, and can confirm it, then it's a valid address.
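
A rough sketch of that confirmation flow, assuming Node.js. Everything here is hypothetical: the `pending` map stands in for real storage, and `sendMail` is a function you would supply yourself, not part of any particular mail library.

```javascript
const crypto = require("crypto");

// Hypothetical in-memory store: token -> address awaiting confirmation.
const pending = new Map();

// sendMail(address, body) is supplied by the caller (an assumption of this sketch).
function requestConfirmation(address, sendMail) {
  const token = crypto.randomBytes(16).toString("hex"); // 32 hex characters
  pending.set(token, address);
  sendMail(address, "Confirm your address with this code: " + token);
  return token;
}

// Returns the confirmed address, or null if the token is unknown or already used.
function confirm(token) {
  if (!pending.has(token)) return null;
  const address = pending.get(token);
  pending.delete(token);
  return address;
}
```

The address only counts as valid once `confirm` returns it, which means the user actually received the message.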

brianegge
  • I understand this, but I would like some up front validation. Just so the aol users can not make mistakes :) There will be no local delivery, so the email must be in the format of user@domain.tld – user170579 Sep 22 '09 at 01:34
0

The following has been useful for me for quite some time now.

function validateEmail(email) { 
    var re = /^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|(\".+\"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;
    return re.test(email);
} 
Roy M J
-1

A perfect validation regex is probably hard to write, but I've used this one for quite some time:

/^([\w-\.\+])+\@([\w-]+\.)+([\w]{2,6})+$/

Only changed it recently to match 6-char TLDs.
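
One thing worth checking with this pattern: as far as I can tell, the trailing `+` after `([\w]{2,6})` lets the group repeat, so the 6-character cap is not actually enforced. The sketch below compares the pattern as posted with a hypothetical variant (`capped`, my own name) that drops the trailing `+`.

```javascript
// Pattern as posted above.
const re = /^([\w-\.\+])+\@([\w-]+\.)+([\w]{2,6})+$/;
// Hypothetical variant without the trailing "+", so the TLD really is 2-6 chars.
const capped = /^([\w-\.\+])+\@([\w-]+\.)+[\w]{2,6}$/;

const checks = {
  sixChar: re.test("user@example.museum"),                  // true
  elevenChar: re.test("user@example.photography"),          // true: group repeats (6 + 5)
  oneChar: re.test("user@example.c"),                       // false: fewer than 2 chars
  cappedElevenChar: capped.test("user@example.photography") // false: over 6 chars
};

console.log(checks);
```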

Ain Tohvri