We all know email address verification is a touchy subject, there are so many opinions on the best way to deal with it without encoding for the entire RFC. But since 2009 its become even more difficult and I haven't really seen anyone address the issue of IDN's yet.
Here is what I've been using:
preg_match(/^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,6}\z/i)
Which will work for most email addresses but what if I need to match a non Latin email address? e.g.: bob@china.中國, or bob@russia.рф
Look here for the complete list. (Notice all the non Latin domain extensions at the bottom of the list.)
Information on this subject can be found here and I think what they are saying is these new characters will simply be read as '.xn--fiqz9s' and '.xn--p1ai' on the machine level but I'm not 100% sure.
If it is, does that mean the only change I need to consider making in my code the following? (For domain extensions like .travelersinsurance and .sandvikcoromant)
preg_match(/^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,20}\z/i)
NOTICE: This is not related to the discussion found on this page Using a regular expression to validate an email address