95

Is it even possible to perform address (physical, not e-mail) validation? It seems like the sheer number of address formats, even in the US alone, would make this a fairly difficult task. On the other hand it seems like a task that would be necessary for several business requirements.

John Smith
Kevin Pang
  • 23
    Whatever you do, if the address seems invalid, let the user confirm that it actually is valid; don't just reject it because the validator says it's invalid. – Pim Jager Jul 23 '09 at 16:19
  • Do you mean "address autocomplete" or "address validation"? I mean, no service will ever know all the changes happening in the real world. Every minute millions of new addresses appear and old ones are demolished. And I'm talking locally now, in the scope of a planet. – Yevgeniy Afanasyev Jun 22 '18 at 07:35

20 Answers

28

Here's a free and sort of "outside the box" way to do it. Not 100% perfect, but it should reject blatantly non-existent addresses.

Submit the entire address to Google's geocoding web service. This service attempts to return the exact coordinates of the location you feed it, i.e. latitude and longitude.

In my experience if the address is invalid you will get a result of 602 from the service. There's definitely a possibility of false positives or false negatives, but used in conjunction with other consistency checks it could be useful.

(Yahoo's geocoding web service, on the other hand, will return the coordinates of the center of the town if the town exists but the rest of the address is bogus. Potentially useful as long as you pay close attention to the "precision" field in the result).
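A minimal sketch of the "check the result, then check the precision" idea, in Python. It assumes the JSON shape of the current Google Geocoding API (a `status` string plus per-result `geometry.location_type` and `partial_match` fields) rather than the numeric 602 code described above. The function name and the three labels are made up for illustration, and the response is passed in already parsed, so no network call or API key is involved.

```python
from typing import Any, Mapping


def classify_geocode(response: Mapping[str, Any]) -> str:
    """Label a parsed geocoding response as 'not_found', 'precise' or 'approximate'.

    Assumption: `response` follows the Geocoding API JSON layout
    (top-level "status" and "results", each result carrying
    "geometry.location_type" and an optional "partial_match").
    """
    if response.get("status") != "OK" or not response.get("results"):
        return "not_found"

    best = response["results"][0]

    # The geocoder sets partial_match when it could not match the whole input.
    if best.get("partial_match", False):
        return "approximate"

    # Only a rooftop-level hit is treated as precise in this sketch. An
    # interpolated or town-centre result does not show the house number exists.
    location_type = best.get("geometry", {}).get("location_type")
    return "precise" if location_type == "ROOFTOP" else "approximate"
```

Anything other than "precise" is probably best treated as a prompt to ask the user to double-check the address, not as grounds for rejecting it outright.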

CRDave
Tim Farley
  • 4
    Is there anyone who actually used this method on a live system? – Aston Aug 04 '11 at 14:25
  • If the page is behind a login this won't work. Google will block all requests once it finds the referring page is unreachable. – Chris Moschini Nov 03 '11 at 15:54
  • 27
    Google and Yahoo's TOS forbid using their geocoding service except in conjunction with the use of a map displayed to the user. Find something more like what Jonathan Oliver has suggested. See: http://code.google.com/apis/maps/terms.html – Matt Feb 12 '12 at 01:05
  • 1
    I'm not getting the said 602 results when I enter a bad address. – HK1 May 24 '12 at 16:03
  • Use the USPS API. It'll return the standardized address that will make it through the mail quicker. – Nick Mar 30 '15 at 22:37
  • 10
    This really should NOT be the selected answer. It violates Google's TOS, and it is not really validating addresses. Google uses a lot of intelligence to "guess" what address you wanted even when the information is slightly off. Google will also return a geocode if you, say, enter 55 Main St and Google knows there is a 50 Main St and a 60 Main St. It will return a successful geocode by averaging out the distance, even if 55 Main St doesn't actually exist. You need to use a real validating service, preferably USPS, or Experian or SmartyStreets. – jacurtis Jun 15 '17 at 00:31
  • 1
    This would make a wonderful Blockchain application where everyone contributes with a node by storing bits and pieces of an immense ever changing database... – buycanna.io Jun 17 '18 at 00:41
  • 1
    Possibly the TOS has been updated but it seems to say that you can't use it in non-google maps. That is different than using the geocodes without a map. Just don't get their geocodes and then load them into a Bing map. "(d) No Use With Non-Google Maps. Customer will not use the Google Maps Core Services in a Customer Application that contains a non-Google map. For example, Customer will not (i) display Places listings on a non-Google map, or (ii) display Street View imagery and non-Google maps in the same Customer Application." – wuliwong Nov 14 '18 at 20:54
22

There are a number of good answers in here but most of them make the assumption that the user wants an "API" solution where they must write code to connect to a 3rd-party service and/or screen scrape the USPS. This is all well and good, but should be factored into the business requirements and costs associated with the implementation and then weighed against the desired benefits.

Depending upon the business requirements and the way that the data is received into the system, a real-time address processing solution may be the best bet. If a real-time solution is required, you will want to consider the license agreement and technical limitations of the Google Maps/Bing/Yahoo APIs. They typically limit the number of calls you can make each day. The USPS Web Tools API is the same; in addition, they restrict how and why you can use their system and how you are allowed to use the data thereafter.

At the same time, there are a handful of great service providers that can easily process a static list of addresses. Essentially, you give the service provider a CSV file or Excel file, they clean it up and get it back to you. It's a one-time deal with no long-term commitment or obligation—usually.

Full disclosure: I'm the founder of SmartyStreets. We do address verification for addresses within the United States. We are easily able to CASS certify a list and we also offer an address verification web service API. We have no hidden fees, contracts, or anything. You use our service until you no longer need it and you can walk away. (Unlike cell phone companies that require a contract.)

Martijn Pieters
Jonathan Oliver
  • 1
    Perfect, just what I was looking for. And let me say, that you have style, sir. Nice looking site. – Cory Mawhorter Mar 10 '13 at 20:11
  • 2
    We use SmartyStreets and it's awesome. – JerSchneid Oct 17 '13 at 19:03
  • 19
    SmartyStreet's cost for 1,000 lookups is $30. Saved you a click. – William Entriken Feb 15 '16 at 15:50
  • 1
    1,000 lookups = 66$ for international – Robert Sinclair Mar 09 '17 at 18:46
  • 1
    Actually the USPS has a REST interface for this... however, and this is a HUGE however... if you hit this thing with a script to validate a database full of addresses they _will_ ban you. You're allowed to validate user data entry with it so it's fine for use on a web form where the API hits will be randomly timed. Just make sure your form has bot protection or you'll get your API access DoS'd in short order. Also this is only good for US addresses. – Neil Davis Jun 28 '19 at 21:35
15

USPS has an address cleaner online, which someone has screen scraped into a poor man's webservice. However, if you're doing this often enough, it'd be a better idea to apply for a USPS account and call their own webservice.

iandisme
Mark Brackett
  • 17
    To save anyone the heartburn from doing that and getting a rejection letter three days later, almost no one is really allowed to use USPS's own address validator. That's why the guy screen scraped it in the first place. It's strictly limited to non-profit organizations. – Nicholas Piasecki Aug 06 '09 at 19:23
  • 17
    The USPS will allow commercial use IFF you're not re-selling or cleansing your databases. We qualified for an account and use it to verify/fix ALL the addresses that come in via our web forms. – marklark Oct 21 '11 at 16:16
  • 4
    You can only use the USPS' API if you are verifying an address to perform a mailing or do shipments: https://secure.shippingapis.com/registration/ -- if you don't do mailings or ship through the USPS, then find an alternative API that doesn't have the license restrictions. – Matt Feb 12 '12 at 01:07
  • 5
    The USPS does not allow this data to be used for anything other than mailing a letter or package. In other words, you must mail to the address you look up. –  Dec 02 '10 at 17:29
  • 2
    Or intend to mail to at some point in time. My company were initially denied access to the USPS API but, after clarifying our intended use (and stating that we wouldn't use the API to fix old data), we were approved. – marklark Oct 21 '11 at 16:20
10

I will refer you to my blog post, A lesson in address storage, where I go into some of the techniques and algorithms used in the process of address validation. My key thought is "Don't be lazy with address storage, it will cause you nothing but headaches in the future!"

Also, there is another StackOverflow question on this topic, entitled "How should international geographic addresses be stored in a relational database".

John Smith
BenAlabaster
8

In the course of developing an in-house address verification service at a German company I used to work for, I came across a number of ways to tackle this issue. I'll do my best to sum up my findings below:

Free, Open Source Software

Clearly, the first approach anyone would take is an open-source one (like openstreetmap.org), which is never a bad idea. But whether or not you can really put this to good and reliable use depends very much on how much you need to rely on the results.

Addresses are incredibly variable. Verifying U.S. addresses is not easy, but it is bearable. Once you move on to Europe, especially the U.K. with its extensive postal code system, the open-source approach will simply lack data.

Web Services / APIs

Enterprise-Class Software

Money gets it done, obviously. But not every business or developer can spend ~$0.15 per address lookup (that's $150 for 1,000 API requests) - a very expensive business model the vast majority of address validation APIs have implemented.

What I ended up integrating: streetlayer API

Since I was not willing to take on the programmatic approach of verifying address data manually I finally came to the conclusion that I was in need of an API with a price tag that would not make my boss want to fire me and still deliver solid and reliable international verification results.

Long story short, I ended up integrating an API built by apilayer, called "streetlayer API". I was easily convinced by a simple JSON integration, surprisingly accurate validation results and their developer-friendly pricing. Also, 100 requests/month are entirely free.

Hope this helps!

Frank
  • Good answer; unfortunately the streetlayer API is shutting down next week (on 2017-07-03). Any alternatives out there for less than 15 cents per request? – Chris Jun 30 '17 at 15:43
1

As seen on reddit:

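// Geocode an address via Yahoo's geocoding endpoint (flags=J asks for JSON output).
$address = urlencode('1600 Pennsylvania Avenue, Washington, DC');
$response = file_get_contents("http://where.yahooapis.com/geocode?q=$address&flags=J");
$json = $response === false ? null : json_decode($response); // null if the request failed
print_r($json);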
Xeoncross
1

For US-based address data my company has used GeoStan. It has bindings for C and Java (and we created a Perl binding). Note that it is a commercial product and isn't cheap. It is quite fast though (~300 addresses per second) and offers features like CASS certification (USPS bulk mail discount), DPV (Delivery Point Verification) flagging, and LON/LAT geocoding.

There is a Perl module Geo::PostalAddress, but it uses heuristics and doesn't have the other features mentioned for GeoStan.

Edit: some have mentioned 'doing it yourself'. If you do decide to do this, a good source of information to start with is the US Census TIGER data set, which contains a lot of information about the US, including address information.

brian d foy
Kyle Burton
1

I have used the services of http://www.melissadata.com. Their "address object" works very well. It's pricey, yes. But when you consider the costs of writing your own solution, the cost of dirty data in your application, returned mailers, lost sales, and the like, the costs can be justified.

Taptronic
1

The Fixaddress.com service provides the following:

1) Address validation.

2) Address correction.

3) Address spelling correction.

4) Correction of phonetic mistakes in addresses.

Fixaddress.com uses USPS and TIGER data as reference data.

For more detail, visit the link below:

http://www.fixaddress.com/

Rakesh Chaudhari
1

One area where address lookups have to be performed reliably is for VOIP E911 services. I know companies reliably using the following services for this:

Bandwidth.com 9-1-1 Access API MSAG Address Validation

MSAG = Master Street Address Guide

https://www.bandwidth.com/9-1-1/

SmartyStreets US Street Address API

https://smartystreets.com/docs/cloud/us-street-api

Grokify
0

Yahoo also has a Placemaker API. It is good only for locations, but it has a universal ID for all world locations.

It looks like there is no standard for this in the ISO list.

Elzo Valugi
0

There are companies that provide this service. Service bureaus that deal with mass mailing will scrub an entire mailing list so that it's in the proper format, which results in a discount on postage. The USPS sells databases of address information that can be used to develop custom solutions. They also have lists of approved vendors who provide this kind of software and service.

There are some (but not many) packages that have APIs for hooking address validation into your software.

However, you're right that it's a pretty nasty problem.

http://www.usps.com/ncsc/ziplookup/vendorslicensees.htm

Jason
0

As mentioned, there are many services out there. If you are looking to truly validate the entire address, then I highly recommend going with a web service so that changes to the address data are quickly picked up by your application.

In addition to the services listed above, webservicex.net has this US Address Validation service: http://www.webservicex.net/WCF/ServiceDetails.aspx?SID=24

Mitchel Sellers
0

We have had success with Perfect Address.

Their database has all the US street names and street number ranges. Also acts as a pretty decent parser for free-form address fields, if you are lucky enough to have that kind of data.

Jason DeFontes
  • Very interesting... their ancient looking website makes me wonder if they are even still in business or reliable though? – Anthony Griggs Jun 20 '19 at 13:41
  • It's been a while since I was at the company where we used that. Doesn't look like they've changed much, but they do still claim to have current data. – Jason DeFontes Aug 14 '19 at 15:57
0

Validating it is a valid address is one thing.

But if you're trying to validate a given person lives at a given address, your only almost-guarantee would be a test mail to the address, and even that is not certain if the person is organised or knows somebody at that address.

Otherwise people could just specify an arbitrary random address which they know exists and it would mean nothing to you.

The best you can do for immediate results is to request that the user send a photographed or scanned copy of the head of their bank statement or some other proof of recent residence, because at least then they have to work harder to forge it, and forgeries of such documents show up easily with a basic level of image forensic analysis.

Kent Fredric
0

There is no global solution. For any given country it is at best rather tricky.

In the UK, the Post Office controls postal addresses, and can provide (at a cost) address information for validation purposes.

Government agencies also keep an extensive list of addresses, and these are centrally collated in the NLPG (National Land and Property Gazetteer).

Actually validating against these lists is very difficult. Most people don't even know exactly how their address is held by the Post Office. Some businesses don't even know what number they are on a particular street.

Your best bet is to approach a company that specialises in this kind of thing.

Kramii
0

NAICS.com is coming out with an API that will add all kinds of key business data including street address. This would happen on the fly as your site's forms are processed. https://www.naics.com/business-intelligence-api/

0

You can try the Pitney Bowes "IdentifyAddress" API, available at https://identify.pitneybowes.com/

The service analyses and compares the input addresses against known address databases around the world to output a standardized result. It corrects addresses, adds missing postal information, and formats the result using the format preferred by the applicable postal authority. It also uses additional address databases so it can provide enhanced detail, including address quality, type of address, transliteration (such as from Chinese Kanji to Latin characters), and whether an address is validated to the premise/house number, street, or city level of reference information.

You will find a lot of samples and SDKs available on the site, and I found it extremely easy to integrate.

0

You could also try SAP's Data Quality solutions, which are available both as a server platform for processing a large number of requests and as an embeddable SDK if you want to run it in-process with your application. We use it in our application and it's very robust and scalable.

-1

For US addresses you can require a valid state, and verify that the zip is valid. You could even check that the zip code is in the right state, but beyond that I don't think there are many tests you could run that wouldn't provide a lot of false negatives.
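A rough sketch of those checks in Python, under stated assumptions: the function name, the result labels and the tiny `ZIP_PREFIX_RANGES` table are made up for illustration. The table is an incomplete, approximate sample of three-digit ZIP prefix ranges, and a real check would need full, current postal data.

```python
import re

# Illustrative, incomplete sample of 3-digit ZIP prefix ranges per state.
# This table is an assumption for the sketch, not authoritative USPS data.
ZIP_PREFIX_RANGES = {
    "NY": [(100, 149)],
    "TX": [(750, 799)],
    "CA": [(900, 961)],
}

# Five digits, optionally followed by a ZIP+4 suffix such as "-1234".
ZIP_PATTERN = re.compile(r"(\d{5})(?:-\d{4})?")


def check_state_and_zip(state: str, zip_code: str) -> str:
    """Return 'ok', 'bad_zip_format', 'state_not_in_table' or 'zip_state_mismatch'."""
    match = ZIP_PATTERN.fullmatch(zip_code.strip())
    if match is None:
        return "bad_zip_format"

    ranges = ZIP_PREFIX_RANGES.get(state.strip().upper())
    if ranges is None:
        return "state_not_in_table"

    # Compare the first three digits of the ZIP with the state's known ranges.
    prefix = int(match.group(1)[:3])
    if any(low <= prefix <= high for low, high in ranges):
        return "ok"
    return "zip_state_mismatch"
```

For example, `check_state_and_zip("NY", "90210")` reports a mismatch, while the street line itself is never inspected, which is as far as this kind of check can go.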

What are you trying to do -- prevent simple mistakes or enforce some kind of identity check?

Rob Walker
  • You have to use 3rd party, pay-for services. – mmcdole Sep 30 '08 at 15:19
  • You don't have to use 3rd party tools. The USPS has web services that do this. – brian d foy Nov 11 '08 at 05:28
  • Actually if the task you are performing is "make sure this address is acceptable to the local postal services" then in the United States, the USPS could be regarded as the *second* party (but then that makes the client who submitted the address kind of a third party). – tripleee Dec 18 '15 at 05:13