1

I'm trying to validate the format of a street address in Google Forms using regex. I won't be able to confirm it's a real address, but I would like to at least validate that the string is:

[numbers(max 6 digits)] [word(minimum one to max 8 words with spaces in between and numbers and # allowed)], [words(minimum one to max four words, only letters)], [2 capital letters] [5 digit number]

I want the spaces and commas I left in between the brackets to be required, exactly where I put them in the above example. For example, this would validate:

123 test st, test city, TT 12345

That's obviously not a real address, but at least it requires the entry of the correct format. The data is coming from people answering a question on a form, so it will always be just an address, no names. Plus, they're all addresses in one area, South Florida, where pretty much all addresses will match this format. The problem I'm having is people not entering a city or the commas, so I want to give them an error if they don't. So far, I've found this:

^([0-9a-zA-Z]+)(,\s*[0-9a-zA-Z]+)*$

But that doesn't allow for multiple words between the commas, or the capital letters and numbers for zip. Any help would save me a lot of headaches, and I would greatly appreciate it.
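For reference, here is a quick way to see how that attempted pattern behaves, sketched in Python purely for illustration (the form itself only takes the bare pattern). The helper name `attempt_accepts` and the second sample string are made up for this check.

```python
import re

# The attempted pattern quoted above, unchanged.
ATTEMPT = re.compile(r"^([0-9a-zA-Z]+)(,\s*[0-9a-zA-Z]+)*$")


def attempt_accepts(text: str) -> bool:
    """Return True if the attempted pattern matches the string."""
    return ATTEMPT.match(text) is not None


# Whitespace is only allowed directly after a comma, so any field that
# contains more than one word makes the whole match fail.
print(attempt_accepts("123 test st, test city, TT 12345"))  # False
print(attempt_accepts("123,test,TT"))  # True
```

The character class `[0-9a-zA-Z]` has no space in it, which is why the sample address from the question is rejected.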

Drop04
  • How many characters/digits in each word? – Matt.G May 25 '18 at 14:10
  • This topic has come up before and regex is not the way to go. https://stackoverflow.com/questions/11160192/how-to-parse-freeform-street-postal-address-out-of-text-and-into-components – Ervin Ruci May 25 '18 at 15:14
  • I did see the other posts about this, but I have a pretty narrow use for this, and the answer Matt.G provided seems to work perfectly for me. I won't have issues like PO boxes and weird formats, which others have to deal with, since they probably have a much larger data set. – Drop04 May 26 '18 at 14:30

3 Answers

2

Try Regex:

\d{1,6}\s(?:[A-Za-z0-9#]+\s){0,7}(?:[A-Za-z0-9#]+,)\s*(?:[A-Za-z]+\s){0,3}(?:[A-Za-z]+,)\s*[A-Z]{2}\s*\d{5}

See Demo
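A rough way to sanity-check the pattern outside the form is a short Python script. This is only a sketch: `fullmatch` is used to emulate whole-string validation, and how Google Forms itself applies the pattern is not verified here. `STRICT` is a hypothetical tightened variant (an assumption, not part of the answer) that adds anchors and requires exactly one space where the question asks for one. The helper `is_valid` and all sample strings except the first are invented for the test.

```python
import re

# Pattern from the answer above, unchanged (note: it has no ^ or $ anchors
# and uses \s*, so the spaces after commas and before the zip are optional).
ANSWER = re.compile(
    r"\d{1,6}\s(?:[A-Za-z0-9#]+\s){0,7}(?:[A-Za-z0-9#]+,)\s*"
    r"(?:[A-Za-z]+\s){0,3}(?:[A-Za-z]+,)\s*[A-Z]{2}\s*\d{5}"
)

# Hypothetical stricter variant (assumption): anchored, single literal spaces.
STRICT = re.compile(
    r"^\d{1,6} (?:[A-Za-z0-9#]+ ){0,7}[A-Za-z0-9#]+, "
    r"(?:[A-Za-z]+ ){0,3}[A-Za-z]+, [A-Z]{2} \d{5}$"
)


def is_valid(pattern: re.Pattern, text: str) -> bool:
    """Return True only if the pattern matches the entire string."""
    return pattern.fullmatch(text) is not None


samples = [
    "123 test st, test city, TT 12345",  # format from the question
    "123 test st test city TT 12345",    # commas missing
    "123 test st, TT 12345",             # city missing
    "123 test st,test city,TT12345",     # required spaces missing
]
for s in samples:
    print(f"{s!r}: answer={is_valid(ANSWER, s)} strict={is_valid(STRICT, s)}")
```

The two patterns only disagree on the last sample: the original accepts it because `\s*` also matches zero spaces, while the stricter variant rejects it.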

Nimantha
Matt.G
  • It does not match addresses like: 751 FAIR OKS AVENUE PASADNA CA https://regex101.com/r/PuWFdz/1 – Ervin Ruci May 25 '18 at 15:09
  • @ErvinRuci, OP's requirement has mandatory commas, spaces and zipcode – Matt.G May 25 '18 at 15:28
  • I think that's it! We'll see, I'm sure someone will find a way to break it, but this is an excellent starting point that I can tweak, if necessary. Thank you! – Drop04 May 25 '18 at 18:50
2

There really is a lot to consider when dealing with a street address--more than you can meaningfully deal with using a regular expression. Besides, if a human being is at a keyboard, there's always a high likelihood of typing mistakes, and there just isn't a regex that can account for all possible human errors.

Also, depending on what you intend to do with the address once you receive it, there's all sorts of helpful information you might need that you wouldn't get just from splitting the rough address components with a regex.

As a software developer at SmartyStreets (disclosure), I've learned that regular expressions really are the wrong tool for this job because addresses aren't as 'regular' (standardized) as you might think. There are more rigorous validation tools available, even plugins you can install on your web form to validate the address as it is typed, and which return a wealth of useful metadata and information.

Martijn Pieters
Michael Whatcott
0

Accepts Apt# also:

(^[0-9]{1,5}\s)([A-Za-z]{1,}(\#\s|\s\#|\s\#\s|\s)){1,5}([A-Za-z]{1,}\,|[0-9]{1,}\,)(\s[a-zA-Z]{1,}\,|[a-zA-Z]{1,}\,)(\s[a-zA-Z]{2}\s|[a-zA-Z]{2}\s)([0-9]{5})
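
A quick check of this pattern, sketched in Python for illustration only. The sample strings and the helper name `apt_accepts` are made up for the test. Note that the pattern is anchored at the start but not at the end, so `re.match` is used here as written rather than a whole-string match.

```python
import re

# Pattern from this answer, unchanged.
APT = re.compile(
    r"(^[0-9]{1,5}\s)([A-Za-z]{1,}(\#\s|\s\#|\s\#\s|\s)){1,5}"
    r"([A-Za-z]{1,}\,|[0-9]{1,}\,)(\s[a-zA-Z]{1,}\,|[a-zA-Z]{1,}\,)"
    r"(\s[a-zA-Z]{2}\s|[a-zA-Z]{2}\s)([0-9]{5})"
)


def apt_accepts(text: str) -> bool:
    """Return True if the pattern matches from the start of the string."""
    return APT.match(text) is not None


# An apartment number introduced by '#' is accepted.
print(apt_accepts("123 test st apt #4, testcity, TT 12345"))  # True

# The city part allows a single word only, so the two-word city from the
# question's own example is rejected.
print(apt_accepts("123 test st, test city, TT 12345"))  # False
```

If multi-word cities are needed, the city group would have to be extended; that change is not shown here.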
Nimantha