3

Does anyone have a PHP class or regex to parse an address into components? At a minimum, it should break the address into these components: street info, state, zip, country

Matt
Andres

7 Answers

7

A library/language agnostic solution would be to use Google's geocoder for this. It can return detailed, broken-down information about a given address.

http://code.google.com/apis/maps/documentation/services.html#Geocoding_Structured
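As a rough sketch of what the broken-down result looks like: in the current Geocoding API's JSON output, each result carries a list of `address_components`, and every component has a `long_name`, a `short_name`, and a list of `types`. The Python below works on a hand-written sample in that shape. The sample values and the `extract_components` and `missing_fields` helpers are illustrative assumptions, and no network request is made.

```python
# Hand-written sample shaped like a Geocoding API JSON response (not real output).
SAMPLE_RESPONSE = {
    "status": "OK",
    "results": [
        {
            "address_components": [
                {"long_name": "1600", "short_name": "1600", "types": ["street_number"]},
                {"long_name": "Amphitheatre Parkway", "short_name": "Amphitheatre Pkwy", "types": ["route"]},
                {"long_name": "Mountain View", "short_name": "Mountain View", "types": ["locality", "political"]},
                {"long_name": "California", "short_name": "CA", "types": ["administrative_area_level_1", "political"]},
                {"long_name": "United States", "short_name": "US", "types": ["country", "political"]},
                {"long_name": "94043", "short_name": "94043", "types": ["postal_code"]},
            ]
        }
    ],
}

# Component types we care about, mapped to the field names from the question.
WANTED = {
    "street_number": "street_number",
    "route": "street",
    "administrative_area_level_1": "state",
    "postal_code": "zip",
    "country": "country",
}


def extract_components(response):
    """Return a dict of address fields from the first result, or {} if none."""
    if response.get("status") != "OK" or not response.get("results"):
        return {}
    fields = {}
    for component in response["results"][0]["address_components"]:
        for type_name in component["types"]:
            if type_name in WANTED:
                fields[WANTED[type_name]] = component["short_name"]
    return fields


def missing_fields(fields, required=("street", "state", "zip", "country")):
    """List required fields that were not returned (e.g. for a partial address)."""
    return [name for name in required if name not in fields]


parsed = extract_components(SAMPLE_RESPONSE)
print(parsed)
print(missing_fields(parsed))
```

Checking `missing_fields` before accepting a result is one way to reject input that geocodes successfully but is not a complete mailing address.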

Samantha Branham
  • +1 Looks like a good resource. keep in mind, though, that it may allow parsing -partial- addresses, such that if you just give it only country and a state, it may be fine with that kind of partial information, whereas if you're building an app that needs to use mailing addresses down the line, that kind of open-ended allowance can come back to bite you. *shrugs* – Kzqai Nov 16 '09 at 21:43
  • Couldn't you just validate against the results of the parse? That is, if you don't mind potentially hitting Google a lot. – Samantha Branham Nov 17 '09 at 06:16
  • There are limits: if you have a large list, you'll have to pay for Google geocoding. There are cheaper solutions like HERE Maps or Bing Maps for bulk. – Dawesi Apr 06 '21 at 05:47
3

Use this just as an example if your data is all formatted very similarly. As Strager pointed out, in most cases there will be too much variation in data to use a regex effectively.

Assuming your input is of the format:

[Street Name], [State], [ZIP], [Country]

This regex will do the trick:

m/^(.+?),\s*(.+?),\s*([0-9]+),\s*(.+)$/

But regular expressions are fairly complex. If you are going to use this for anything significant, I would suggest taking the time to learn them. I have always found the "Regular Expressions Cheat Sheet" very useful.
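For illustration, here is a minimal Python sketch of the same approach, using a variant of the pattern that tolerates whitespace after each comma. The `split_address` helper and the sample addresses are assumptions made up for this example. It also shows how easily the pattern misreads input that strays from the expected format.

```python
import re

# Four comma-separated parts; the third must be all digits (the ZIP).
ADDRESS_RE = re.compile(r"^(.+?),\s*(.+?),\s*([0-9]+),\s*(.+)$")


def split_address(line):
    """Return (street, state, zip, country), or None if the line does not match."""
    match = ADDRESS_RE.match(line.strip())
    if match is None:
        return None
    return tuple(part.strip() for part in match.groups())


# Well-formed input is split as intended.
print(split_address("123 Main St, CA, 90210, USA"))

# An extra comma-separated part ends up inside the "state" group.
print(split_address("123 Main St, Apt 4, CA, 90210, USA"))

# A non-numeric postal code does not match at all.
print(split_address("10 Downing St, London, SW1A 2AA, UK"))
```

The second and third calls are the kind of variation that makes a single regex unreliable on free-form address data.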

the Tin Man
Vlad the Impala
  • 5
    Due to the many possible forms for addresses, I don't think a regular expression is feasible. – strager Nov 16 '09 at 02:49
  • +1 for the regex cheat sheet, useful, though I don't think that regex is going to be a great solution to an open address field. – Kzqai Nov 16 '09 at 21:23
2

If you're talking about pre-existing data, good luck to ye. If this is something that you have control over the input for, I recommend creating separation of the different parts of the address at the input level. Jus' a suggestion.

Kzqai
  • I tried to keep the interface as simple as possible. I wanted to give the users the ability to simply enter the address in a textarea and I would later try to parse it. – Andres Nov 16 '09 at 02:48
  • 2
    Keep in mind that by providing only an open-ended field, you're actually making it -more- complex, because you aren't specifying any strong requirements for the input, so the users -will- put in information in formats that you -won't- be able to parse. Better to provide separated fields and then provide an "other address info" field for addresses that might not fit your pattern. – Kzqai Nov 16 '09 at 21:39
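A minimal sketch of the separated-fields idea, in Python for illustration. The `AddressForm` class and its field names are hypothetical, not taken from any particular framework: one attribute per input, plus a free-text catch-all for addresses that do not fit the pattern.

```python
from dataclasses import dataclass


@dataclass
class AddressForm:
    """Hypothetical form model: one attribute per input field."""
    street: str
    state: str
    zip_code: str
    country: str
    other_info: str = ""  # free-text catch-all for addresses that do not fit

    def missing(self):
        """Return the names of required fields that were left blank."""
        required = ("street", "state", "zip_code", "country")
        return [name for name in required if not getattr(self, name).strip()]


form = AddressForm(street="123 Main St", state="CA", zip_code="", country="USA",
                   other_info="Leave at side door")
print(form.missing())
```

With the parts collected separately, nothing has to be parsed later, and blank required fields can be reported to the user right away.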
1

Here is a Python version using pyparsing for parsing street addresses. It's not PHP, but might give you some insights into the complexity of the problem.

PaulMcG
1

The issue is that addresses themselves come in all shapes and sizes and they are not self-validating entities. This means that there is no way to really know if you did it right without inspecting the address by hand (and even then it can be error prone) or by using some kind of address verification software--be it desktop-based software or online.

There are a number of address verification web services that can take an address and break it into its component parts and do so in a safe manner where the results have been certified to be valid.

I should mention that I'm the founder of SmartyStreets. We do address verification which includes the capabilities that you have asked about for US-based addresses. Our flagship product is US Address API which is an address verification web service API.

Martijn Pieters
Jonathan Oliver
  • As good as this commercial SmartyStreets offer sounds, beware - it is for USA only. – Artur Bodera Jan 13 '14 at 10:50
  • Right on our homepage, in big type, we say "USPS address validation". Here's our FAQ page about it: http://smartystreets.com/kb/faq/do-you-verify-international-addresses Here are Google's results when searching our site for "United States": https://www.google.com/search?q=site%3Asmartystreets.com+united+states – Jonathan Oliver Jan 13 '14 at 13:36
  • Fair enough, deleted my comment on disqus, but the argument still stands that due to this limitation SStreets is only a partial solution to a general problem of address verification brought up in this question. – Artur Bodera Jan 13 '14 at 14:39
  • 1
    @ArturBodera My answer stands independent of the organization I represent. The OP question asks about using a regex to parse an address into its component parts. My answer is that an address cannot be inherently self validating and therefore must use an outside authority. Nonetheless, for clarity I have also updated my answer to indicate we are currently a US-only system. – Jonathan Oliver Jan 14 '14 at 03:34
0

How about this one,

http://www.analysisandsolutions.com/software/addr/

ZZ Coder
  • 1
    This is cool software, but doesn't seem to do what the asker wanted. Instead of giving the user components of an address, it standardizes the address. Useful, but not what was asked. – mooreds Mar 22 '11 at 16:22
0

I've found a php address parser that was designed for Poland but might work elsewhere with some modification:

PHP address parser

McDowell
PowerAktar