20

Some HTML5 input elements accept the pattern attribute, which is a regex used for form validation. Other HTML5 input elements, such as input type=email, do the validation automatically.

Now it seems that the way validation is handled differs across browsers. Given a specific browser, say Chrome, is it possible to programmatically extract the regex used for validation? Or is there documentation somewhere?

serk
Randomblue
  • @jfriend00 Take a look here : http://www.regular-expressions.info/email.html – FailedDev Oct 16 '11 at 17:50
  • @jfriend00: It depends on what your notion of a "valid email address" is. As indicated in this answer (http://stackoverflow.com/questions/201323/what-is-the-best-regular-expression-for-validating-email-addresses/201378#201378), RFC 822 can be covered by a regex, though RFC 5322 cannot. HTML5 specifies a much narrower notion of a "valid email address" which can be validated using a regex (see my answer). – ig0774 Oct 16 '11 at 18:19

3 Answers

27

The HTML5 spec currently lists a valid email address as one matching the ABNF:

1*( atext / "." ) "@" ldh-str *( "." ldh-str )

which is elucidated in this question. @SLaks's answer provides a regex equivalent.

That said, a little digging through the source shows that WebKit implemented email address validation using basically the same regex as in SLaks's answer, i.e.,

[a-z0-9!#$%&'*+/=?^_`{|}~.-]+@[a-z0-9-]+(\.[a-z0-9-]+)*
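As a rough way to see what this pattern accepts, here is a minimal Python sketch. It assumes the pattern is applied case-insensitively and must match the whole value; both are assumptions of this sketch, and the name `matches` is made up for illustration.

```python
import re

# The pattern quoted above. IGNORECASE and whole-string matching are
# assumptions of this sketch, not something the quoted regex states.
WEBKIT_STYLE = re.compile(
    r"[a-z0-9!#$%&'*+/=?^_`{|}~.-]+@[a-z0-9-]+(\.[a-z0-9-]+)*",
    re.IGNORECASE,
)


def matches(value: str) -> bool:
    """Return True if the whole string matches the quoted pattern."""
    return WEBKIT_STYLE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("user@example.com", "word_word.word@word",
                      "user@", "user@example..com", "user@-example-.com"):
        print(candidate, matches(candidate))
```

Note that a dotless domain such as `word` and labels with leading or trailing hyphens are both accepted by this pattern.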

However, there is no requirement that email addresses be validated by a regex. For example, Mozilla (Gecko) implemented email validation using a fairly basic finite state machine, so in a given browser there may be no regex to extract at all.
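To illustrate how validation can work without any regex, here is a small hand-written state machine in Python that follows the ABNF quoted above. It is only a sketch of the technique, not Gecko's actual code; the state names and the function name `is_valid_email` are invented for this example.

```python
import string

# atext from RFC 5322 and the letter/digit/hyphen set used by ldh-str.
ATEXT = set(string.ascii_letters + string.digits + "!#$%&'*+-/=?^_`{|}~")
LDH = set(string.ascii_letters + string.digits + "-")


def is_valid_email(value: str) -> bool:
    """Check 1*( atext / "." ) "@" ldh-str *( "." ldh-str ) one character at a time."""
    state = "local_start"  # nothing consumed yet
    for ch in value:
        if state in ("local_start", "local"):
            if ch == "@":
                if state == "local_start":
                    return False  # empty local part
                state = "label_start"
            elif ch in ATEXT or ch == ".":
                state = "local"
            else:
                return False
        else:  # "label_start" or "label": inside the domain
            if ch in LDH:
                state = "label"
            elif ch == "." and state == "label":
                state = "label_start"  # a new label must follow the dot
            else:
                return False
    # Valid only if the input ended inside a non-empty domain label.
    return state == "label"


if __name__ == "__main__":
    for candidate in ("user@example.com", "word_word.word@word",
                      "user@example.", "a@b@c"):
        print(candidate, is_valid_email(candidate))
```

A validator written this way accepts the same strings as the equivalent regex, but there is no pattern object anywhere that a script could read back.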

ig0774
  • [a-z0-9!#$%&'*+/=?^_`{|}~.-]+@[a-z0-9-]+(\.[a-z0-9-]+)* matches word_word.word@word in FF9 linux – jpse Jan 24 '12 at 13:38
  • @jpse: If you read the linked spec above, you'll note that it states that it *willfully violates RFC 5322*, or, in other words, the validation used for HTML email fields fails to match some valid email addresses and matches some strings that are not valid email addresses. – ig0774 Jan 24 '12 at 15:11
1

The HTML5 spec now gives a (non-normative) regex which is supposed to exactly match all email addresses that it specifies as valid. There's a copy of it on my blog here: http://blog.gerv.net/2011/05/html5_email_address_regexp/ and in the spec itself: https://html.spec.whatwg.org/#e-mail-state-(type=email)

The version above is incorrect only in that it does not limit domain components to max 255 characters and does not prevent them beginning or ending with a "-".
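The two extra restrictions can be sketched in Python as follows. This is not a verbatim copy of the spec's regex: the pattern is assembled by hand, and the label-length cap is kept as a parameter (`MAX_LABEL`, a name made up here) set to 63, the DNS label limit, as an assumption. Check the linked spec for the authoritative bound.

```python
import re

# Assumed cap on the length of one domain label; adjust to match the spec.
MAX_LABEL = 63

LOCAL = r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
# A label starts and ends with a letter or digit; hyphens only in between.
LABEL = rf"[a-zA-Z0-9](?:[a-zA-Z0-9-]{{0,{MAX_LABEL - 2}}}[a-zA-Z0-9])?"
STRICT = re.compile(rf"{LOCAL}@{LABEL}(?:\.{LABEL})*")


def matches_strict(value: str) -> bool:
    """Return True if the whole string matches the stricter pattern."""
    return STRICT.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("user@example.com", "user@-example.com",
                      "user@example-.com", "user@" + "a" * 64 + ".com"):
        print(candidate[:30], matches_strict(candidate))
```

Compared with the looser pattern, labels that begin or end with "-" and labels longer than the cap are rejected.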

Gerv

Gervase Markham
-1

This works for me: pattern="[^@]+@[^@]+\.[a-zA-Z]{2,6}"

lll