0

Good evening, I am working on a regular expression to validate email addresses in JavaScript. I have the function working, but I know this expression could be written better. How could I improve it?

function checkEmailAddress(inputEmailAddy){
    var regex = new RegExp("^([A-Z][a-z][0-9][_][-][.])+\@([A-Z][a-z][0-9][_][-][.])+\.([A-Za-z]{2,4})$/");

    return (regex.test(inputEmailAddy) == 1 ? 1 : -1);
}

Thanks, Mike

Andrius
mightymouse3062
  • 1
    Maybe: http://www.linuxjournal.com/article/9585 – Jared Farrish Oct 28 '11 at 00:15
  • 2
    The quintessential email address regex: http://stackoverflow.com/questions/1903356/email-validation-regular-expression – Clive Oct 28 '11 at 00:16
  • 1
    This has been asked many times, try this search: http://stackoverflow.com/search?q=Email+address+validation+regex&submit=search – Nathan Manousos Oct 28 '11 at 00:16
  • 1
    possible duplicate of [What is the best regular expression for validating email addresses?](http://stackoverflow.com/questions/201323/what-is-the-best-regular-expression-for-validating-email-addresses) – nrabinowitz Oct 28 '11 at 00:25
  • Regular expressions: http://www.regular-expressions.info/email.html – Galled Oct 28 '11 at 00:32
  • Possible duplicate of [Using a regular expression to validate an email address](http://stackoverflow.com/questions/201323/using-a-regular-expression-to-validate-an-email-address) – Andrius Apr 30 '17 at 14:19

3 Answers

2

The local-part of the email, to the left of the @ sign, can contain absolutely anything if quoted properly, and you simply cannot interpret quoting with a regular expression - you must parse the email address according to RFC rules, or you will reject some valid email addresses.

Even with a regex that is "good enough" you still have to send a confirmation email to verify that it's a legitimate address.

(Most of the answers given here on SO, the many times this question has been asked, will fail and reject my email address, because I have a plus sign, +, in my address.)
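As a rough illustration of a "good enough" check that does not reject a plus sign, here is a minimal sketch. The function name looksLikeEmail and the pattern are my own assumptions, not something taken from an RFC or from the question. It only checks the overall shape (something, an @, something with a dot in it), so it will still reject quoted local parts that contain spaces, and it says nothing about whether the mailbox exists. The confirmation email is still needed.

```javascript
// Hypothetical helper: a deliberately loose shape check, not an RFC parser.
function looksLikeEmail(address) {
    // [^\s@]+ means one or more characters that are neither whitespace nor "@".
    // The domain part must contain at least one dot.
    var loosePattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
    return loosePattern.test(address);
}

looksLikeEmail('someone+tag@example.com'); // true: "+" is accepted in the local part
looksLikeEmail('someone@example');         // false: no dot in the domain
looksLikeEmail('not an address');          // false: no "@"
```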

Stephen P
0

Here is some example code:

<input type="text" id="email">
function test(){
    // getElementById takes the bare id, without the '#' used in CSS selectors
    var emailElement = document.getElementById('email');
    var emailPattern = /^[a-zA-Z]([a-zA-Z0-9_\-])+([\.][a-zA-Z0-9_]+)*\@((([a-zA-Z0-9\-])+\.){1,2})([a-zA-Z0-9]{2,40})$/;

    // A RegExp is not callable; .test() returns true on a match, false otherwise
    return emailPattern.test(emailElement.value);
}
Anup Panwar
0

I usually use this one in PHP's eregi (I know it's deprecated):

'^[[:alnum:]][a-z0-9_\.\-]*@[a-z0-9\.\-]+\.[a-z]{2,4}$'

I quickly changed it to work in JS:

/^[a-z][a-z0-9_\.\-]*@[a-z0-9\.\-]+\.[a-z]{2,4}$/i

Quick check:

var r = /^[a-z][a-z0-9_\.\-]*@[a-z0-9\.\-]+\.[a-z]{2,4}$/i;

r.test('someone@somesite.com'); // true
r.test('xyz@xyz.xyz'); // true
r.test('abc@3'); // false
r.test('xyz'); // false
r.test('asdf@asdf.asdfasdfasdf'); // false

One gotcha is that I'm using {2,4} for the last part, so it matches things like .net and .com. But it won't match valid ones like .museum, while it will match nonexistent ones like .xx
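To make that limitation concrete, here is a short check against the same pattern. The variable name tldLengthPattern and the two test addresses are made up for illustration.

```javascript
// Same pattern as above; only the variable name is new.
var tldLengthPattern = /^[a-z][a-z0-9_\.\-]*@[a-z0-9\.\-]+\.[a-z]{2,4}$/i;

// "museum" is six letters, which is more than {2,4} allows.
tldLengthPattern.test('someone@example.museum'); // false

// Any run of two to four letters passes, whether or not it is a real TLD.
tldLengthPattern.test('someone@example.xx'); // true
```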

Now, just for the lulz, I've created a regexp similar to the one above, but instead of the [a-z]{2,4} I set it up to match every valid domain I am aware of:

/^[a-z][a-z0-9_\.\-]*@[a-z0-9\.\-]+\.(?:aero|asia|biz|cat|com|coop|edu|gov|info|int|jobs|mil|mobi|museum|name|net|org|pro|tel|travel|ac|ad|ae|af|ag|ai|al|am|an|ao|aq|ar|as|at|au|aw|ax|az|ba|bb|bd|be|bf|bg|bh|bi|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cc|cd|cf|cg|ch|ci|ck|cl|cm|cn|co|cr|cu|cv|cx|cy|cz|de|dj|dk|dm|do|dz|ec|ee|eg|er|es|et|eu|fi|fj|fk|fm|fo|fr|ga|gb|gd|ge|gf|gg|gh|gi|gl|gm|gn|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|im|in|io|iq|ir|is|it|je|jm|jo|jp|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly|ma|mc|md|me|mg|mh|mk|ml|mm|mn|mo|mp|mq|mr|ms|mt|mu|mv|mw|mx|my|mz|na|nc|ne|nf|ng|ni|nl|no|np|nr|nu|nz|om|pa|pe|pf|pg|ph|pk|pl|pm|pn|pr|ps|pt|pw|py|qa|re|ro|rs|ru|rw|sa|sb|sc|sd|se|sg|sh|si|sj|sk|sl|sm|sn|so|sr|st|su|sv|sy|sz|tc|td|tf|tg|th|tj|tk|tl|tm|tn|to|tp|tr|tt|tv|tw|tz|ua|ug|uk|us|uy|uz|va|vc|ve|vg|vi|vn|vu|wf|ws|xxx|ye|yt|za|zm|zw)$/i

giving you:

r.test('someone@somesite.com'); // true
r.test('someone@somefakesite.xb'); // false
r.test('asdf'); // false
r.test('.museum'); // false
r.test('someone@somemuseumsite.museum'); // true

This, of course, makes for a crazy long regexp and (if you want it to hold up over time) will have to be maintained.
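If you did go down this road, one way to make the maintenance slightly less painful is to keep the TLDs in an array and build the pattern from it. This is only a sketch: knownTlds and buildEmailPattern are names I made up, and the list here is cut down to a handful of entries for readability. It does not remove the need to keep the list current.

```javascript
// Hypothetical, deliberately short list; a real one would need regular updates.
var knownTlds = ['com', 'net', 'org', 'museum', 'uk'];

function buildEmailPattern(tlds) {
    // Entries are assumed to be plain letters, so no regex escaping is applied.
    // Backslashes are doubled because the pattern is built from a string.
    var source = '^[a-z][a-z0-9_.\\-]*@[a-z0-9.\\-]+\\.(?:' + tlds.join('|') + ')$';
    return new RegExp(source, 'i');
}

var listPattern = buildEmailPattern(knownTlds);

listPattern.test('someone@somemuseumsite.museum'); // true
listPattern.test('someone@somefakesite.xb');       // false
```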

Also, in all of these cases, some valid (but very uncommon) addresses will fail, such as somebody@[192.168.2.1]

Marshall
  • 2
    It would also reject **my** email address, which has a `+` in it. See the monstrosity mentioned in the answer referenced by Clive's comment to the question. – Stephen P Oct 28 '11 at 00:46
  • Thanks! Yeah, that looks way crazier yet better. – Marshall Oct 28 '11 at 00:54
  • 1
    Considering that anyone with enough money can now buy a new top-level domain, hard-coding top-level domains in a regex is quite a bad idea. You're almost guaranteed that it will be wrong in the future. – HoLyVieR Oct 28 '11 at 01:07
  • 1
    That's why I said 'for the lulz' - just showing how overboard a regexp for email can be. – Marshall Oct 28 '11 at 17:02