6

I'm looking for a good way to validate a US address. I know there may be no proper way of doing this, so I'm going for the basic format: number, street name, city, state, and ZIP code.

Any ideas would be appreciated. Thanks.

Juha Syrjälä
  • What part of the address are you trying to validate with a Regex? The whole thing, the zip code? – Gavin Miller Sep 04 '09 at 19:28
  • As I said previously, I'm trying to see if the string starts with a number (any size), followed by letters (street name and city), and a two-letter state code. –  Sep 04 '09 at 19:40
  • 1
    The problem with what you just mentioned is that addresses don't have to start with a number (e.g. One Microsoft Way, PO boxes) and the street name doesn't have to contain letters (highways, numbered streets). – jimyi Sep 04 '09 at 19:47
  • 1
    Thanks, I'm aware of that, but as I mentioned, I'm heading for the "common basic format". –  Sep 04 '09 at 19:53

6 Answers

5

Don't try. Somebody is likely to have a post office box, or an apartment number etc., and they will be really irate with you. Even a "normal" street name can have numbers, like 125th Street (and many others) in New York City. Even a suburb can have some numbered streets.

And city names can have spaces.

Yacoby
Robert L
  • I lived on a street named "East South Boulder Road". That is, the eastern portion of the street named "South Boulder Road". This was great fun to explain to people asking for my address. – Commodore Jaeger Sep 05 '09 at 04:16
  • 5
    In Australia there is a *town* called 1770. And to answer your next question, no, the postcode is 4677. – too much php Sep 08 '09 at 03:24
  • These cases still should be solvable with a sophisticated Regex. – Acyra Feb 28 '13 at 19:26
4

Ask the user to enter the parts of the address in separate fields (street name, city, state, and ZIP code) and apply whatever validation is appropriate to each field. This is the general practice.

Alternatively, if you want the simplest regex that matches four strings separated by three commas, try this:

/^(.+),([^,]+),([^,]+),([^,]+)$/

If the string matches, you can use additional pattern matching to check the components of the address. There is no way to check the validity of the street address with a pattern, but you might be able to test postal codes and state codes.
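
The two-step idea above (split on commas first, then check individual components) can be sketched in Python roughly as follows. This is only an illustration: `split_address`, `check_components`, the truncated `KNOWN_STATES` set, and the ZIP pattern are hypothetical names and choices, not part of the original answer.

```python
import re

# The four-part pattern from the answer above, unchanged.
ADDRESS_RE = re.compile(r"^(.+),([^,]+),([^,]+),([^,]+)$")

# Hypothetical per-component checks; the state set is truncated for brevity.
KNOWN_STATES = {"CA", "IL", "NY", "TX", "WA"}
ZIP_RE = re.compile(r"^[0-9]{5}(?:-[0-9]{4})?$")


def split_address(text):
    """Return (street, city, state, zip) with whitespace trimmed, or None."""
    match = ADDRESS_RE.match(text)
    if match is None:
        return None
    return tuple(part.strip() for part in match.groups())


def check_components(parts):
    """Return a list of problems found in the state and ZIP components."""
    street, city, state, zip_code = parts
    problems = []
    if state.upper() not in KNOWN_STATES:
        problems.append("unknown state code")
    if not ZIP_RE.match(zip_code):
        problems.append("malformed ZIP code")
    return problems
```

The street and city parts are deliberately left unchecked here, since a pattern cannot tell whether they exist.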

Salman A
2

There are way too many variations in addresses to be able to do this using regular expressions. You're better off finding a web service that can validate addresses. USPS has one, but you'll have to request permission to use it.

jimyi
  • Actually, it's not the variations, but the fact that a mailing address is not regular. At least, I don't think it is. If someone can prove that it is, please do so - I'd be interested. – Thomas Owens Sep 04 '09 at 19:32
2

I agree with Salman: have the user enter the data in different fields (one for ZIP, one for state, one for city, and one for the number and street name). Use a different regex for each field. For the street number and name, the best expression I came up with was

/^[0-9]{1,7} [a-zA-Z0-9]{2,35}\a*/
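
The pattern's behaviour is easy to probe in Python. The sketch below uses `a-zA-Z0-9` for the character class and leaves out the trailing `\a*`; treat both as assumptions about the intent. `STREET_FULL_RE` and `matched_text` are hypothetical additions, included only to show one way of allowing multi-word street names.

```python
import re

# Assumed rendering of the street pattern (see the note above).
# It has no end anchor, so matching a prefix of the input is enough.
STREET_PREFIX_RE = re.compile(r"^[0-9]{1,7} [a-zA-Z0-9]{2,35}")

# Hypothetical stricter variant: allows further space-separated words
# and anchors the end, so the whole field has to fit the pattern.
STREET_FULL_RE = re.compile(r"^[0-9]{1,7}(?: [a-zA-Z0-9.]{1,35})+$")


def matched_text(pattern, street):
    """Return the portion of `street` the pattern matched, or None."""
    match = pattern.match(street)
    return match.group(0) if match else None
```

With the unanchored pattern, "1234 Old Statesville Rd" is accepted but only "1234 Old" is actually matched; the anchored variant has to consume the whole string. Neither accepts "One Microsoft Way".
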
rajah9
rich
  • Would that regex work with compound names or with a ST/DR/AVE afterwards? I'm thinking of "1234 Old Statesville Rd" or "123 Main ST". The street name part doesn't take spaces. – rajah9 Nov 28 '12 at 14:39
0

This is not a bulletproof solution, but the assumption is that an address begins with a street number and ends with a ZIP code of either 5 or 9 digits.

([0-9]{1,} [\s\S]*? [0-9]{5}(?:-[0-9]{4})?)

Like I said, it's not bulletproof, but I've used it with marginal success in the past.
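
As a rough illustration of what "marginal success" can look like, here is the same pattern used from Python to pull an address-like span out of free text. The helper name `find_address` and the sample strings are made up for this sketch.

```python
import re

# The pattern from the answer above, unchanged.
ADDRESS_RE = re.compile(r"([0-9]{1,} [\s\S]*? [0-9]{5}(?:-[0-9]{4})?)")


def find_address(text):
    """Return the first address-like span in `text`, or None."""
    match = ADDRESS_RE.search(text)
    return match.group(1) if match else None


# A typical address embedded in a sentence is picked up.
print(find_address("Ship to 123 Main St, Springfield, IL 62704-1234 today"))

# An address without a leading number is missed.
print(find_address("One Microsoft Way, Redmond, WA 98052"))

# Any number followed later by five digits is a false positive.
print(find_address("I ate 5 apples in 12345 seconds"))
```

The last two calls show both failure modes: a real address that is rejected and a non-address that is accepted.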

JasonBartholme
0

Over here in New Zealand, you can license the official list of postal addresses from New Zealand Post - giving you the data needed to populate a table with every valid postal address in New Zealand.

Validating against this list is a whole lot easier than trying to come up with a Regex - and the accuracy is much much higher as well, as you end up with three cases:

  • The address you're validating is in the list, so you know it is a real address
  • The address you're validating is very similar to one in the list, so you know it is probably a real address
  • The address you're validating is not similar to anything in the list, so it may or may not be real.
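
The three cases above can be sketched in Python with the standard library's `difflib`. Everything here is an assumption made for illustration: `KNOWN_ADDRESSES` stands in for a licensed list (the entries are invented), and the `0.85` similarity cutoff is arbitrary. A real list would be far too large for a linear scan like this.

```python
import difflib

# Hypothetical stand-in for a licensed address list.
KNOWN_ADDRESSES = [
    "12 Example Street, Wellington 6011",
    "48 Sample Road, Auckland 1010",
]


def _normalise(address):
    """Lower-case the address and collapse runs of whitespace."""
    return " ".join(address.lower().split())


def classify(address, known=KNOWN_ADDRESSES, cutoff=0.85):
    """Return 'listed', 'similar', or 'unknown' for the given address."""
    candidate = _normalise(address)
    normalised = [_normalise(entry) for entry in known]
    if candidate in normalised:
        return "listed"
    # get_close_matches returns entries whose similarity ratio >= cutoff.
    if difflib.get_close_matches(candidate, normalised, n=1, cutoff=cutoff):
        return "similar"
    return "unknown"
```
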

The best you'll get with a RegEx is

  • The address you're validating matches the regex, so it might be a real address
  • The address you're validating does not match the regex, so it might not be a real address

Needing to know postal addresses is a pretty common situation for many businesses, so I believe that licensing a list will be possible in most areas.

The only sticky bit will be pricing.

Bevan