3

I have an ASP.NET/C# application that has a form field that asks the user for their location, which we take and pass to Bing Maps for geocoding purposes. For some reason my client wants to limit input to these three formats:

San Francisco, CA 91111
San Francisco, CA
91111

However, I know they'll end up asking for Canadian postal code support as well.

Of course, they asked for this two hours before the project launches, so I don't have much time to research regex and figure it out myself (I'm terrible at regex). I figured I'd ask here.

Can anyone come up with a regex that I can use to validate that the input fits one of the above three formats, with support for Canadian postal codes as well? (It doesn't have to support ZIP+4.)

Scott

7 Answers

12

I tried this and it seems to work for all the cases you specified:

var pattern =
    @"
    (^[\w\s]+,\s\w{2}$)|                        # City, State
    (^[\w\s]+,\s\w{2}\s\d{5}$)|                 # City, State and US PostCode
    (^[\w\s]+,\s\w{2}\s(\w\d\w\s?\d\w\d)$)|     # City, State and Canada PostCode
    (^\d{5}$)|                                  # US PostCode
    (^\w\d\w\s?\d\w\d$)                         # Canada PostCode";

When using this regex, make sure you either:

  • specify RegexOptions.IgnorePatternWhitespace

or

  • use the condensed (less readable) version: (^[\w\s]+,\s\w{2}$)|(^[\w\s]+,\s\w{2}\s\d{5}$)|(^[\w\s]+,\s\w{2}\s(\w\d\w\s?\d\w\d)$)|(^\d{5}$)|(^\w\d\w\s?\d\w\d$)
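
As a rough sketch of the first option, the verbose pattern can be passed to the Regex constructor together with RegexOptions.IgnorePatternWhitespace. The class name LocationValidator and the IsValid helper are hypothetical names introduced only for this illustration, and trimming the input first is an assumption about how the form value should be handled.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LocationValidator
{
    // Verbose pattern from the answer above. The whitespace and # comments
    // are only ignored because IgnorePatternWhitespace is set below.
    private static readonly Regex Pattern = new Regex(@"
        (^[\w\s]+,\s\w{2}$)|                        # City, State
        (^[\w\s]+,\s\w{2}\s\d{5}$)|                 # City, State and US PostCode
        (^[\w\s]+,\s\w{2}\s(\w\d\w\s?\d\w\d)$)|     # City, State and Canada PostCode
        (^\d{5}$)|                                  # US PostCode
        (^\w\d\w\s?\d\w\d$)                         # Canada PostCode",
        RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: true when the trimmed input fits one of the formats.
    public static bool IsValid(string input)
    {
        return input != null && Pattern.IsMatch(input.Trim());
    }
}
```

Without the option, the line breaks and comments would be treated as literal characters and nothing would match.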
Cristian Lupascu
  • 3
    According to this earlier post (https://stackoverflow.com/a/1146231/6009304) regarding a Canadian postal code regex, "Canadian postal codes can't contain the letters D, F, I, O, Q, or U, and cannot start with W or Z". There are regexes provided in the linked post. – Tyler Oct 18 '17 at 16:57
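
A minimal sketch of what a stricter Canadian check could look like, built only from the letter rules quoted in the comment above. This is not the regex from the linked post; the class name CanadianPostalCode and the IsValid helper are hypothetical, and allowing a single optional space between the two halves is an assumption carried over from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CanadianPostalCode
{
    // First letter: no D, F, I, O, Q, U, W or Z.
    // Second and third letters: no D, F, I, O, Q or U.
    private static readonly Regex Pattern = new Regex(
        @"^[ABCEGHJ-NPRSTVXY]\d[ABCEGHJ-NPRSTV-Z] ?\d[ABCEGHJ-NPRSTV-Z]\d$",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Hypothetical helper: checks the letter/digit shape only,
    // not whether the code is actually assigned.
    public static bool IsValid(string input)
    {
        return input != null && Pattern.IsMatch(input.Trim());
    }
}
```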
2

To match Canadian or US postal codes, you can use (^\d{5}(-\d{4})?$)|(^[ABCEGHJKLMNPRSTVXY]{1}\d{1}[A-Z]{1} *\d{1}[A-Z]{1}\d{1}$). Since you don't really need the city and state when a postal code is present, you can ignore the rest of the input when the regex matches. So put the postal code pattern in a capturing group at the end of the input and extract it. For example:

// Verbatim string (@"...") so that \d and \b reach the regex engine unchanged.
// Group 1 captures a trailing US ZIP (optionally ZIP+4) or Canadian postal code.
Regex postalCodeRegex = new Regex(
    @"^.*?\b(\d{5}(?:-\d{4})?|[ABCEGHJKLMNPRSTVXY]\d[A-Z] *\d[A-Z]\d)$",
    RegexOptions.IgnoreCase | RegexOptions.Compiled | RegexOptions.CultureInvariant);

Match m = postalCodeRegex.Match(userInput);
if (m.Success)
{
    string postalCode = m.Groups[1].Value;
    // Set location based on postal code
}
else
{
    // Set location based on city
}
Diego
2

C# uses the .NET regex engine rather than PCRE, but the syntax below is the same in both:

Match one or more letters, spaces, or dashes for the city

[A-Za-z\s\-]+

followed by a comma, an optional space, and a two-letter state code,

,\s?[A-Za-z]{2}

followed by a space and either a 5-digit number or a 6-character alphanumeric string.

\s(\d{5}|[A-Za-z0-9]{3}\s?[A-Za-z0-9]{3})

So for your first example, combine all three parts. For your second example, combine the first two. For your third example, use only the last part with its leading \s removed.

EDIT: found out there is sometimes a space in Canadian postal codes. Added support for that.
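
One way the three fragments could be stitched together, as a sketch only. The ^ and $ anchors, the alternation that lets a postal code stand on its own, and the names LocationPattern, City, State, Code and IsMatch are all assumptions added for illustration; the fragments themselves are copied from above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LocationPattern
{
    // Fragments copied from the answer above.
    private const string City = @"[A-Za-z\s\-]+";
    private const string State = @",\s?[A-Za-z]{2}";
    private const string Code = @"(\d{5}|[A-Za-z0-9]{3}\s?[A-Za-z0-9]{3})";

    // City + state with an optional postal code, or a postal code alone.
    // Anchors are an added assumption so that the whole input must match.
    private static readonly Regex Combined = new Regex(
        "^(?:" + City + State + @"(?:\s" + Code + ")?|" + Code + ")$");

    public static bool IsMatch(string input)
    {
        return input != null && Combined.IsMatch(input);
    }
}
```

Because the postal code part accepts any six alphanumeric characters, this is a format check only and will let through strings that are not real postal codes.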

Cfreak
0

This has taken a lot of work, but this will validate most versions of city state zip and city state. We use this in production for address validation in the millions, so it is pretty solid.

((?:\w|\s|\w\.)+),?\s(?i:AL|AK|AS|AZ|AR|CA|CO|CT|DE|DC|FM|FL|GA|GU|HI|ID|IL|IN|IA|KS|KY|LA|ME|MH|MD|MA|MI|MN|MS|MO|MT|NE|NV|NH|NJ|NM|NY|NC|ND|MP|OH|OK|OR|PW|PA|PR|RI|SC|SD|TN|TX|UT|VT|VI|VA|WA|WV|WI|WY|Alabama|Alaska|Arizona|Arkansas|California|Colorado|Connecticut|Delaware|District of Columbia|Florida|Georgia|Hawaii|Idaho|Illinois|Indiana|Iowa|Kansas|Kentucky|Louisiana|Maine|Maryland|Massachusetts|Michigan|Minnesota|Mississippi|Missouri|Montana|Nebraska|Nevada|New Hampshire|New Jersey|New Mexico|New York|North Carolina|North Dakota|Ohio|Oklahoma|Oregon|Pennsylvania|Rhode Island|South Carolina|South Dakota|Tennessee|Texas|Utah|Vermont|Virginia|Washington|West Virginia|Wisconsin|Wyoming)(|.(\d{5}(-\d{4}|\d{4}|$)))$
VinnieS
0

Not sure if it's the best regex for this, but try:

([\D]+)? ([\D]+)?([\d]+)?

EDITED:

([\D]+)? ([\D]+)?([\d]+)?([\d\D]+){2}
Bogdan Emil Mariesan
0

I'm unsure of the exact specs you are asking for, but you could use an expression like this to match strings of the formats in your examples:

var re = @"(?xi)^\s*
    (?:
       [a-z][^,]+ , \s+ [a-z]{2}   
       (?: \s+ \d{5} )?            # optional postal code
    |
        \d{5}                      # postal code
    |
        [a-z]\d[a-z]\s*\d[a-z]\d   # canadian code
    )
    \s*$";
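
A short usage sketch for the pattern above: because it starts with the inline options (?xi), no RegexOptions argument should be needed for the free-spacing layout or the case-insensitive matching. The class name InlineOptionsDemo and the IsValid helper are hypothetical names used only here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InlineOptionsDemo
{
    // (?xi) turns on IgnorePatternWhitespace and IgnoreCase inside the
    // pattern itself, so the constructor takes no options argument.
    private static readonly Regex Re = new Regex(@"(?xi)^\s*
    (?:
       [a-z][^,]+ , \s+ [a-z]{2}
       (?: \s+ \d{5} )?            # optional postal code
    |
        \d{5}                      # postal code
    |
        [a-z]\d[a-z]\s*\d[a-z]\d   # canadian code
    )
    \s*$");

    public static bool IsValid(string input)
    {
        return input != null && Re.IsMatch(input);
    }
}
```

Note that this pattern accepts a Canadian code only on its own; a city and province followed by a Canadian code does not match as written.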
Qtax
0

OK. Not being a regex-pert myself, I tend to break the problem down into smaller regexes and then combine them.

Therefore, City and State would be:

([a-zA-Z ]+, [a-zA-Z ]+)

US Zip code would be

(\d{5})

Canadian postal code would be:

([a-zA-Z]\d[a-zA-Z] ?\d[a-zA-Z]\d)

So ZIP or postal codes would be:

((\d{5})|([a-zA-Z]\d[a-zA-Z] ?\d[a-zA-Z]\d))

Putting them all together gives us:

(([a-zA-Z ]+, [a-zA-Z]+) ((\d{5})|([a-zA-Z]\d[a-zA-Z] ?\d[a-zA-Z]\d))?|((\d{5})|([a-zA-Z]\d[a-zA-Z] ?\d[a-zA-Z]\d)))

(City and State followed by an optional ZIP, or a ZIP on its own)

I'm sure that there are easier ways to do the letters but I'm waiting for a job to finish and thought I would put my two-pennyworth in

Hope this helps.

DaveyBoy