Does anyone have a recommended pseudo-algorithm for the following? Given a string containing an address:

Break the address apart into a Street variable, a City variable, a State variable, and a Zip variable.

The address string may be formatted in a number of different ways. For example, it may be comma-separated or it may be separated by spaces. Also, the address may contain only a city and state, with no street address or zip code. Similarly, it may contain a street, city, and state, but no zip code.

To make things harder, I cannot use regular expressions (I am developing on a mobile platform that does not support them).

Thanks!

Matt
littleK
  • I'm confused... "Break apart the address apart into a Street variable, a City variable, a State variable, and a Zip variable" and "Also, the address may only contain...not a street address or zip code" seem to contradict each other. – Jonathan Oct 27 '10 at 19:19
  • Sorry for the confusion. If the address does not contain a particular element (city, state, or zip code), then it should not be broken apart (as there is nothing to break apart). I'm basically using if statements to generate XML. So, if (city, state, and zip exist){ then form XML with those elements } else { form other XML } – littleK Oct 27 '10 at 19:22
  • So address = city + state OR address = street + city + state? – orangepips Oct 27 '10 at 19:24
  • Also, the state can be spelled out or abbreviated (to make things harder) – littleK Oct 27 '10 at 19:33
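
The XML step described in the comments above could be sketched as follows. Rather than branching on every combination of present and missing parts, this version simply emits an element only when its value exists. The class name `AddressXml` and the element names (`address`, `street`, `city`, `state`, `zip`) are hypothetical, since the real schema is not given in the question.

```java
/** Sketch only: the element names below are assumptions, not a known schema. */
final class AddressXml {

    /** Builds an <address> element, skipping any part that is null or empty. */
    static String toXml(String street, String city, String state, String zip) {
        // StringBuffer rather than StringBuilder, in case the runtime is an older one.
        StringBuffer xml = new StringBuffer("<address>");
        appendElement(xml, "street", street);
        appendElement(xml, "city", city);
        appendElement(xml, "state", state);
        appendElement(xml, "zip", zip);
        return xml.append("</address>").toString();
    }

    /** Appends <tag>value</tag> with basic escaping, or nothing if value is absent. */
    private static void appendElement(StringBuffer xml, String tag, String value) {
        if (value == null || value.length() == 0) {
            return;
        }
        xml.append('<').append(tag).append('>');
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': xml.append("&amp;"); break;
                case '<': xml.append("&lt;"); break;
                case '>': xml.append("&gt;"); break;
                default:  xml.append(c);
            }
        }
        xml.append("</").append(tag).append('>');
    }
}
```

With only a city and state available, `AddressXml.toXml(null, "Springfield", "IL", null)` produces an `address` element containing just the `city` and `state` children.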

2 Answers

Here is a cool solution using Google Maps, provided by John. Maybe you want to use that:

Java postal address parser

Ramp
  • Thanks for the link, Ramp, but unfortunately I cannot use any external web services. I need to do everything locally in my Java code. But Google is definitely awesome at formatting such information... – littleK Oct 27 '10 at 19:29

No, but look at JGeocoder:

http://jgeocoder.sourceforge.net/parser.html

They split addresses up into their constituent parts. You could take a look at the source for that...
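
If regular expressions are off the table, one possible starting point is to work from the right-hand end of the string using only basic `String` methods. The sketch below is not JGeocoder's approach. It assumes US-style addresses, and the `AddressParser` class, its tiny `STATES` and `SUFFIXES` tables, and the split heuristics are all illustrative assumptions. It peels off the zip first, then the state, then splits what remains into street and city.

```java
/** Regex-free sketch: peels zip, then state, then city off the right-hand end. */
final class AddressParser {

    /** Parsed parts; a field is null when that part was not found. */
    static final class Address {
        String street, city, state, zip;
    }

    // Illustrative subset only: a real table needs every state and territory.
    private static final String[][] STATES = {
        {"NY", "New York"}, {"NC", "North Carolina"}, {"CA", "California"},
        {"TX", "Texas"}, {"IL", "Illinois"}, {"MA", "Massachusetts"}
    };

    // Illustrative subset of street suffixes, used only when there is no comma.
    private static final String[] SUFFIXES = {
        "St", "Street", "Ave", "Avenue", "Rd", "Road", "Blvd", "Dr", "Ln"
    };

    static Address parse(String input) {
        Address a = new Address();
        String rest = trimSeparators(input);

        // 1. Zip: the last token, if it looks like 12345 or 12345-6789.
        String last = lastToken(rest);
        if (isZip(last)) {
            a.zip = last;
            rest = trimSeparators(rest.substring(0, rest.length() - last.length()));
        }

        // 2. State: longest known abbreviation or full name at the end of the text.
        String match = null;
        for (int i = 0; i < STATES.length; i++) {
            for (int j = 0; j < 2; j++) {
                String s = STATES[i][j];
                if (endsWithWord(rest, s) && (match == null || s.length() > match.length())) {
                    match = s;
                    a.state = STATES[i][0]; // normalise to the abbreviation
                }
            }
        }
        if (match != null) {
            rest = trimSeparators(rest.substring(0, rest.length() - match.length()));
        }

        // 3. Street vs city: prefer the last comma, else fall back to a street suffix.
        int comma = rest.lastIndexOf(',');
        int split = (comma >= 0) ? comma : endOfStreet(rest);
        a.street = emptyToNull(trimSeparators(rest.substring(0, split)));
        a.city = emptyToNull(trimSeparators(rest.substring(split)));
        return a;
    }

    /** Index just past the first street suffix, or 0 if the text has no street part. */
    private static int endOfStreet(String s) {
        if (s.length() == 0 || s.charAt(0) < '0' || s.charAt(0) > '9') return 0;
        int pos = 0;
        while (pos < s.length()) {
            int end = s.indexOf(' ', pos);
            if (end < 0) end = s.length();
            String token = s.substring(pos, end);
            if (token.endsWith(".")) token = token.substring(0, token.length() - 1);
            for (int i = 0; i < SUFFIXES.length; i++) {
                if (token.equalsIgnoreCase(SUFFIXES[i])) return end;
            }
            pos = end + 1;
        }
        return 0; // no known suffix: cannot split reliably, everything lands in city
    }

    private static boolean isZip(String t) {
        if (t.length() != 5 && t.length() != 10) return false;
        for (int i = 0; i < t.length(); i++) {
            char c = t.charAt(i);
            if (i == 5) { if (c != '-') return false; }
            else if (c < '0' || c > '9') return false;
        }
        return true;
    }

    /** True if text ends with word (ignoring case) on a token boundary. */
    private static boolean endsWithWord(String text, String word) {
        int start = text.length() - word.length();
        if (start < 0 || !text.substring(start).equalsIgnoreCase(word)) return false;
        return start == 0 || isSeparator(text.charAt(start - 1));
    }

    private static String lastToken(String s) {
        int i = s.length();
        while (i > 0 && !isSeparator(s.charAt(i - 1))) i--;
        return s.substring(i);
    }

    private static String trimSeparators(String s) {
        int from = 0, to = s.length();
        while (from < to && isSeparator(s.charAt(from))) from++;
        while (to > from && isSeparator(s.charAt(to - 1))) to--;
        return s.substring(from, to);
    }

    private static boolean isSeparator(char c) { return c == ' ' || c == ',' || c == '\t'; }

    private static String emptyToNull(String s) { return s.length() == 0 ? null : s; }
}
```

For example, `AddressParser.parse("123 Main St Springfield IL")` yields street `123 Main St`, city `Springfield`, state `IL`, and a null zip. The comma-free case is inherently ambiguous: splitting at the first street suffix keeps cities like "St Louis" intact but misplaces unit numbers such as "Apt 4", so a fuller solution would likely need a city list or a more complete suffix table.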

Jonathan Holloway
  • I was just looking at that as we speak, the only problem is that it does use regular expressions. But perhaps I can still use some of its logic. – littleK Oct 27 '10 at 19:47