2

I have a database with a series of entries (no more than say 10s of thousands) where one of the fields is zip code.

I am looking to create a script that when given a zip code and a mile radius (like 50 miles, 100 miles), I can find all the entries that are within that range.

I have seen many articles and scripts on the web that allow one to do this, but they often do this by including a somewhat hefty database of their own.

Is there a way to this using simple calculations where no database is needed?

NOTE: accuracy is not a big deal. If it works for 60 miles when 50 miles is entered, that is perfectly fine.

EDIT

I am noticing a lot of answers that involve getting the lat/lon of the zip code, however, the database only has zip codes stored, so can I assume this includes retroactively getting/storing the lat/lon of each entry into the database as well?

steveo225
  • 11,394
  • 16
  • 62
  • 114

6 Answers6

2

If accuracy is not an issue, you can plot the longitude and latitude of the center of the zip code, or a rough estimate of the center. Then, you can have a script calculate the distance between each of the centers, using the formulas listed here . There is NO need for the use of the haversine formula as illustrated here, that is for extreme accuracy, which you do not need. The trigonometric and inverse trigonometric functions drain much computational resources, and is a waste. Querying the distance every time and making MySQL run through such a complex formula for every zip code is just a waste of resources.

If you must do this many times every day, it might be faster (space vs speed tradeoff here) to just store the distance between every zip code. The total number of entries in such a storage table is the square of the number of zip codes you have. Running through tens of thousands of zip codes each time to calculate that consumes vast amounts of computational power, not to mention be slower for the end user.

Also, a similar question has been asked on here before, please look at the accepted answer on this question, it applies to your situation: Calculate distance between zip codes and users

Community
  • 1
  • 1
LostInTheCode
  • 1,724
  • 2
  • 15
  • 22
  • +1 for "just calculate it once". Also, Haversine would be required for large distances; but for 100-mile radius, the difference between a curved and flat surface would be indeed insignificant. – Piskvor left the building Jun 08 '11 at 13:08
  • +1 I agree. Precalculating as much as possible will definitely help performance. I think I am going to use a hybrid approach based on the data provided by Piskvor, but this will speed up the hot zone calculations. Thanks – steveo225 Jun 08 '11 at 13:32
2

No. ZIP codes do follow certain logic at a high level, but there is no algorithmic mapping between ZIP code and location (i.e., zip codes are assigned to a location, not computed for a location - this process is overseen by the postal authority (USPS in the USA), but it will always take non-geographical factors into account, e.g. population density; hence the impossibility of computing ZIP code from a location or vice versa).

So, you will need that "hefty database" to tell you which ZIP code belongs to what location; once you have the location, you can compute which entries are within the radius, as usual.

(You could query a webservice, but that just hides the database in the cloud - at some point, someone has to check where 90210 lies physically.)

Piskvor left the building
  • 91,498
  • 46
  • 177
  • 222
  • The certain high level logic is actually what I was hoping to take advantage of. Again, accuracy is not a really big deal, so long as it is reasonable. – steveo225 Jun 08 '11 at 12:35
  • @steveo225: See the map at that Wikipedia link, then. I don't think you can get a significantly more accurate result from this than "this ZIP code is somewhere in California", though. – Piskvor left the building Jun 08 '11 at 13:05
  • I saw that, I was hoping there was some logic related to North, East, South, West, but it doesn't appear so. Thanks USPS – steveo225 Jun 08 '11 at 13:15
  • 1
    @steveo225: Well, that's what we get for people refusing to settle the land uniformly, clustering close to resources instead and making those pesky cities ;) – Piskvor left the building Jun 08 '11 at 13:19
  • I think using state approach will actually work well. 90% of the time, all the zip codes are within 200 miles of each other. So, perhaps a lat/lon database just over that area coupled with the data in the map on wikipedia. If it isn't in the hot zone, then we can group by state. – steveo225 Jun 08 '11 at 13:30
2

Another option that may or may not be suitable in this situation is to use counties when looking for a radius. For most of the projects I have worked on, the user is aware of their county and is familiar with it. You can buy a county to zip database (and city/state/etc) for $20-$40. It is something that can be reused over and over again in many different projects. It doesn't necessarily give you the exact ability to do a 10/25/50/100 mile radius, but may allow you to perform an in my region style of radius lookup based upon the requested zipcode.

Adam MacDonald
  • 1,958
  • 15
  • 19
0

You could use google maps to get latitude & longitude for this zip codes, query maybe like this "zip, "

And that you could read this http://www.movable-type.co.uk/scripts/latlong.html to see how to calculate distance between to points

Here is how to do this using SQL http://www.movable-type.co.uk/scripts/latlong-db.html, first it calculate range, and that search by this range (it will work faster)

azat
  • 3,545
  • 1
  • 28
  • 30
0

One option would be to use Google maps api to find the lat and long of your location. You would then have to have a lat and long in your database for each of your zip codes and you write a script to calculate from there.

eagle12
  • 1,658
  • 11
  • 14
-1

The main thing you need is to translate those zip codes into geocoordinates (i.e. latitude and longitude couples). Once you have that, you can look at this presentation which explains in detail how to search by distance given coordinates.

Seldaek
  • 40,986
  • 9
  • 97
  • 77