
Based on the answer to "Using a regular expression to validate an email address", which led me to http://fightingforalostcause.net/misc/2006/compare-email-regex.php, I'd like to use this regex for email validation in my ASP.NET MVC app:

/^[-_a-z0-9\'+*$^&%=~!?{}]++(?:\.[-_a-z0-9\'+*$^&%=~!?{}]+)*+@(?:(?![-.])[-a-z0-9.]+(?<![-.])\.[a-z]{2,6}|\d{1,3}(?:\.\d{1,3}){3})(?::\d++)?$/iD

Unfortunately, I get this error:

System.ArgumentException was unhandled by user code Message="parsing \"/^[-_a-z0-9\'+$^&%=~!?{}]++(?:\.[-_a-z0-9\'+$^&%=~!?{}]+)*+@(?:(?![-.])[-a-z0-9.]+(?

Has anyone ever converted this to be usable by .NET's Regex class, or is there another .NET regular expression class that is a better fit with PHP's preg_match function?

devuxer

4 Answers


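The problem with your regular expression in .NET is that possessive quantifiers (`++` and `*+`) aren't supported. The PHP delimiters and trailing flags (`/.../iD`) aren't part of the pattern in .NET either; the `i` flag becomes `RegexOptions.IgnoreCase`. If you remove those, it works. Here's the regular expression as a C# string: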

@"^[-_a-z0-9\'+*$^&%=~!?{}]+(?:\.[-_a-z0-9\'+*$^&%=~!?{}]+)*@(?:(?![-.])[-a-z0-9.]+(?<![-.])\.[a-z]{2,6}|\d{1,3}(?:\.\d{1,3}){3})(?::\d+)?$"

Here's a test bed for it based on the page you linked to, including all the strings that should match and the first three of those that shouldn't:

using System;
using System.Text.RegularExpressions;

public class Program
{
    static void Main(string[] args)
    {
        foreach (string email in new string[]{
            "l3tt3rsAndNumb3rs@domain.com",
            "has-dash@domain.com",
            "hasApostrophe.o'leary@domain.org",
            "uncommonTLD@domain.museum",
            "uncommonTLD@domain.travel",
            "uncommonTLD@domain.mobi",
            "countryCodeTLD@domain.uk",
            "countryCodeTLD@domain.rw",
            "lettersInDomain@911.com",
            "underscore_inLocal@domain.net",
            "IPInsteadOfDomain@127.0.0.1",
            "IPAndPort@127.0.0.1:25",
            "subdomain@sub.domain.com",
            "local@dash-inDomain.com",
            "dot.inLocal@foo.com",
            "a@singleLetterLocal.org",
            "singleLetterDomain@x.org",
            "&*=?^+{}'~@validCharsInLocal.net",
            "missingDomain@.com",
            "@missingLocal.org",
            "missingatSign.net"
        })
        {
            string s = @"^[-_a-z0-9\'+*$^&%=~!?{}]+(?:\.[-_a-z0-9\'+*$^&%=~!?{}]+)*@(?:(?![-.])[-a-z0-9.]+(?<![-.])\.[a-z]{2,6}|\d{1,3}(?:\.\d{1,3}){3})(?::\d+)?$";
            bool isMatch = Regex.IsMatch(email, s, RegexOptions.IgnoreCase);
            Console.WriteLine(isMatch);
        }
    }
}

Output:

True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
False
False
False

One problem, though, is that it fails to match some valid email addresses, such as foo\@bar@example.com. It's better to match too much than too little.
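If you would rather keep the no-backtracking behaviour of the original PHP pattern than drop it, one option is to rewrite each possessive quantifier as an atomic group, which .NET does support: `X++` becomes `(?>X+)` and `X*+` becomes `(?>X*)`. The sketch below is only an illustration of that rewrite, not a tested replacement. The class and method names are made up for the example, and it assumes `\z` is an acceptable stand-in for PHP's `D` flag (in .NET, a plain `$` also matches just before a trailing newline).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named for this example only.
public static class AtomicGroupSketch
{
    // Each possessive X++ / X*+ from the PHP pattern is rewritten as an
    // atomic group (?>X+) / (?>X*). \z replaces "$" plus the /D flag, so
    // the match must end at the very end of the input.
    public const string Pattern =
        @"^(?>[-_a-z0-9\'+*$^&%=~!?{}]+)(?>(?:\.[-_a-z0-9\'+*$^&%=~!?{}]+)*)@" +
        @"(?:(?![-.])[-a-z0-9.]+(?<![-.])\.[a-z]{2,6}|\d{1,3}(?:\.\d{1,3}){3})" +
        @"(?::(?>\d+))?\z";

    // RegexOptions.IgnoreCase takes the place of the /i flag.
    private static readonly Regex EmailRegex =
        new Regex(Pattern, RegexOptions.IgnoreCase);

    public static bool IsMatch(string email)
    {
        return EmailRegex.IsMatch(email);
    }

    public static void Main()
    {
        string[] samples =
        {
            "has-dash@domain.com",
            "IPAndPort@127.0.0.1:25",
            "missingatSign.net"
        };

        foreach (string email in samples)
        {
            Console.WriteLine(email + " -> " + IsMatch(email));
        }
    }
}
```

For these inputs the atomic groups should not change which strings match; they only stop the engine from retrying shorter local parts after a failure, which is what the possessive quantifiers did in PHP.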

Mark Byers
  • Thanks for your attempt, but this rejects all but two of the valid emails on this site: http://fightingforalostcause.net/misc/2006/compare-email-regex.php, so the conversion is not equivalent. – devuxer Jan 10 '10 at 02:53
  • Did you remember to use case insensitive matching? Only two of those emails have only lower case, and I'm guessing those were the two you matched successfully. I've added some code to show how the regular expression should be used with the IgnoreCase option. – Mark Byers Jan 10 '10 at 03:05
  • With `RegexOptions.IgnoreCase`, it only accepts has-dash@domain.com and subdomain@sub.domain.com. All the other valid emails are rejected. – devuxer Jan 10 '10 at 04:08
  • I've checked it myself. It works for all the emails given! You must have an error in your code. I'll provide more source code. – Mark Byers Jan 10 '10 at 10:33
  • @Mark, my apologies. Just checked my code this morning, and I had put the `IgnoreCase` in one part of my code but not the other. After fixing this, the results look *much* better. It accepts all the valid emails, and the only invalid ones it accepts are TLDDoesntExist@domain.moc and local@SecondLevelDomainNamesAreInvalidIfTheyAreLongerThan64Charactersss.org. – devuxer Jan 10 '10 at 18:46

You really shouldn't be using a RegEx to parse email addresses in .NET. Your better option is to use the functionality built into the framework.

Try to use your email string in the constructor of the MailAddress class. If it throws a FormatException then the address is no good.

try 
{
    // MailAddress lives in the System.Net.Mail namespace.
    MailAddress addr = new MailAddress("theEmail@stackoverflow.com");
    // <- Valid email if this line is reached
}
catch (FormatException)
{
    // <- Invalid email if this line is reached
}

You can see an answer a Microsoft developer gave to another email validation question, where he explains that .NET's email parsing has improved dramatically in .NET 4.0. Since .NET 4.0 is still in beta at the time of writing, you probably aren't running it, but even earlier versions of the framework have adequate email address parsing code. Remember, you're most likely going to use the MailAddress class to send your email anyway, so why not use it to validate your email addresses? In the end, being valid to the MailAddress class is all that matters.
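As a rough sketch, the try/catch above could be wrapped in a small helper so calling code gets a simple yes or no. The class and method names here are invented for the example. The extra comparison against `Address` is my own assumption about what "valid" should mean: the MailAddress constructor also accepts display-name forms such as `Some Name <user@example.com>`, which you may not want in a plain address field.

```csharp
using System;
using System.Net.Mail;

// Hypothetical helper, named for this example only.
public static class EmailCheck
{
    public static bool IsValidEmail(string candidate)
    {
        if (string.IsNullOrEmpty(candidate))
        {
            return false;
        }

        try
        {
            MailAddress addr = new MailAddress(candidate);

            // Assumption: only a bare address counts as valid, so inputs
            // that carry a display name or extra whitespace are rejected.
            return addr.Address == candidate;
        }
        catch (FormatException)
        {
            return false;
        }
        catch (ArgumentException)
        {
            // Defensive: the constructor also rejects null or empty input.
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsValidEmail("theEmail@stackoverflow.com"));
        Console.WriteLine(IsValidEmail("missingatSign.net"));
    }
}
```

Using exceptions for control flow is not free, so if this runs in a tight loop it may be worth measuring before relying on it.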

Dan Herbert
  • Testing on .NET 3.5, it accepts quite a few invalid email addresses: missingDomain@.com, missingDot@com, someone-else@127.0.0.1.26, domainStartsWithDash@-domain.com, domainEndsWithDash@domain-.com, TLDDoesntExist@domain.moc, numbersInTLD@domain.c0m, and local@SecondLevelDomainNamesAreInvalidIfTheyAreLongerThan64Charactersss.org. That said, you make a good point about using `MailAddress` to send the emails anyway. And maybe .NET 4.0 does a better job. – devuxer Jan 10 '10 at 06:07
  • @DanThMan Those addresses aren't necessarily invalid. Most of those addresses you mentioned are possible on internal networks, and therefore can potentially be valid. Remember, it's better to allow invalid addresses than reject valid ones. – Dan Herbert Jan 10 '10 at 15:58
  • @Dan, another good point, but for this particular application, I'm definitely expecting internet email addresses (rather than intranet). At the moment, though, I think I like this solution the best. I will actually need to send out emails, so using `MailAddress` to validate seems like the way to go. – devuxer Jan 10 '10 at 18:51
  • I concur with DH on avoiding the use of RegEx to do email address validation. Let the .NET Framework do this validation; doing so reduces coding effort and yields a more elegant email validation solution. – JohnH Apr 21 '15 at 15:05

.NET regular expression syntax is not the same as PHP's, and Regex is the only built-in class for regular expressions (though there may be third-party implementations). Anyway, it's pretty easy to validate an email address with Regex. Straight from the source:

^([0-9a-zA-Z]([-\.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$
Thomas Levesque
  • This fails several of the tests on this site: http://fightingforalostcause.net/misc/2006/compare-email-regex.php – devuxer Jan 10 '10 at 02:47

I've used this function before in a bunch of e-commerce applications and never had a problem.

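    public static bool IsEmailValid(string emailAddress)
    {
        // The pattern only lists A-Z, so IgnoreCase is needed for lower-case addresses.
        // \b is a word boundary, not an anchor: IsMatch succeeds if any
        // substring of the input looks like an address.
        Regex emailRegEx = new Regex(@"\b[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b", RegexOptions.IgnoreCase);
        return emailRegEx.IsMatch(emailAddress);
    }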