As many people already know, correctly validating email addresses can be a nightmare. You can search all day for a C# regex that matches the current RFC standards, and you'll find different regular expressions that give different results.
If you look at http://en.wikipedia.org/wiki/Email_address#Local_part, you'll see that a period at the beginning or end of the local part is not allowed. Two consecutive periods are also not allowed. However, the following NUnit tests show that System.Net.Mail.MailMessage lets you instantiate a MailMessage object for some invalid email address formats.
[Test]
[TestCase(@"foobar@exampleserver")] //technically valid from the wiki article
[TestCase(@"jsmith@[192.168.2.1]")] //technically valid from the wiki article
[TestCase(@"niceandsimple@example.com")] //vanilla email address
[TestCase(@"very.common@example.com")] //also standard
[TestCase(@"a.little.lengthy.but.fine@dept.example.com")] //long with lots of periods
[TestCase(@"disposable.style.email.with+symbol@example.com")] //disposable with the + symbol
[TestCase(@"other.email-with-dash@example.com")] //period and dash in local part
[TestCase(@"user-test-hyphens@example-domain.com")] //lots of hyphens
[TestCase(@"!#$%&'*+-/=?^_`{|}~@example-domain.com")] //all these symbols are allowed in local part
[TestCase(@"ër_%لdev@gكňil.com")] //characters outside the ascii range are permitted
[TestCase(@"""abcdefghixyz""@example.com")] //technically valid
//[TestCase(@"abc.""defghi"".xyz@example.com")] //technically valid, but .NET throws exception
public void CanCreateMailMessageObjectTest(string emailAddress)
{
    // Passes as long as the constructor does not throw.
    var mailMessage = new System.Net.Mail.MailMessage("noreply@example.com", emailAddress);
}
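All of the active test cases above pass. The last one is commented out because .NET throws an exception for it, even though the address is technically valid.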
[Test]
[TestCase(@".test@example.com")] //leading period
[TestCase(@"test.@example.com")] //period at end of local part <---FAIL
[TestCase(@"test..example@example.com")] //double period in local part <---FAIL
[TestCase(@"foobar@example!#$%^&*()=server.com")] //special characters in domain part
[TestCase(@"Abc.example.com")] //No @ separating local and domain part
[TestCase(@"A@b@c@example.com")] //more than one @ symbol
[TestCase(@"just""not""right@example.com")] //quoted strings must be dot separated
[TestCase(@"a""b(c)d,e:f;g<h>i[j\k]l@example.com")] //special symbols "(),:;<>@[\] not inside quotes
[TestCase(@"[test@example.com")] //leading special symbol in local part
[TestCase(@"this is""not\allowed@example.com")] //spaces not in quotes
[TestCase(@"this\ still\""not\\allowed@example.com")] //backslashes not in quotes
[ExpectedException(typeof (System.FormatException))]
public void CannotCreateMailMessageObjectTest(string emailAddress)
{
    // Passes only if the constructor throws System.FormatException.
    var mailMessage = new System.Net.Mail.MailMessage("noreply@example.com", emailAddress);
}
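For reference, the dot rules from the Wikipedia article (no leading period, no trailing period, no two consecutive periods in an unquoted local part) can be written as a small standalone check. This is only a sketch: the class and method names are hypothetical, it treats everything before the last @ as the local part, and it deliberately skips quoted local parts, which follow different rules.

```csharp
using System;

// Hypothetical helper, not part of .NET: checks only the period rules
// for an unquoted local part.
public static class LocalPartDotRules
{
    // Returns true when the local part starts or ends with a period,
    // or contains two consecutive periods.
    public static bool Violates(string address)
    {
        if (address == null) throw new ArgumentNullException("address");

        // Assumption: everything before the last '@' is the local part.
        int at = address.LastIndexOf('@');
        if (at <= 0) return false; // no local part to inspect; other checks must reject this

        string local = address.Substring(0, at);

        // Quoted local parts follow different rules, so this sketch skips them.
        if (local[0] == '"') return false;

        return local[0] == '.'
            || local[local.Length - 1] == '.'
            || local.Contains("..");
    }
}
```

With this check, both cases marked FAIL above (test.@example.com and test..example@example.com) are flagged, as is .test@example.com, while very.common@example.com is not.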
Why on earth do test.@example.com and test..example@example.com fail to throw a System.FormatException? Who is wrong here, Microsoft or Wikipedia? Are there any email addresses that are grandfathered in and allowed to have a trailing period or a double period? Should my validation allow them? I have proper exception handling in place so that my email delivery service can carry on if an exception occurs, but I'd like to throw out email addresses that are either invalid or guaranteed to throw an exception.
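To make the "guaranteed to throw an exception" part concrete, here is a rough sketch of the kind of pre-filter in question. It probes System.Net.Mail.MailAddress directly instead of building a whole MailMessage; that is an assumption on my part, since the MailMessage(string, string) constructor may parse its recipient argument slightly differently. The class and method names are hypothetical.

```csharp
using System;
using System.Net.Mail;

// Hypothetical helper: reports whether .NET itself accepts the address.
public static class MailAddressProbe
{
    public static bool CanConstruct(string address)
    {
        // Null, empty, or blank input is rejected before .NET sees it.
        if (string.IsNullOrWhiteSpace(address)) return false;

        try
        {
            // Only the side effect matters: the constructor throws
            // FormatException when it rejects the address.
            new MailAddress(address);
            return true;
        }
        catch (FormatException)
        {
            return false;
        }
    }
}
```

This only tells you what the framework will accept. As the tests above show, that is not the same as what the Wikipedia article calls valid, so any stricter rule (such as the period rules) would have to be checked separately.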