11

I have searched all over for this, but I can't seem to find the best approach to this. I have about 22000 lat/lon points and I want to find the closest one's to the current location of the iPhone. I've seen people ask about Quad Trees, Dijkstra's Algorithm, and spatial databases. Which is the best for the iPhone? Spatial databases seem easiest, but I am not sure.

EDIT: there are actually over 20,000 points. You think iterating through all of them is the way to do it? But thanks for you input.

Thanks.

luvieere
  • 37,065
  • 18
  • 127
  • 179
edrai
  • 111
  • 1
  • 1
  • 5
  • 1
    Premature optimization is the root of all evil. In particular, I don't think you can find a minimum in less than O(n) (you've got to examine each element at least once to check if it's the closest), so I think the only thing you could optimize is the distance calculation (which still should be pretty fast). – las3rjock May 29 '09 at 04:00
  • look at this answer http://stackoverflow.com/a/12997900/779408 – Bob Oct 22 '12 at 12:16

6 Answers6

5

Actually, it is best to use Haversine (great circle) calculation for Lat/Long points, otherwise increasingly large distances will be wrong, especially if you use simple trig like in Jherico's answer.

A quick search provides this javascript example:

var R = 6371; // km Radius of earth
var dLat = (lat2-lat1).toRad();
var dLon = (lon2-lon1).toRad(); 
var a = Math.sin(dLat/2) * Math.sin(dLat/2) +
        Math.cos(lat1.toRad()) * Math.cos(lat2.toRad()) * 
        Math.sin(dLon/2) * Math.sin(dLon/2); 
var c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a)); 
var d = R * c;

In terms of the datastructure, Geohash is worth looking at.

Community
  • 1
  • 1
Alex Taylor
  • 7,128
  • 2
  • 26
  • 22
  • That's nice to know. However, the original question was not about calculation of the distance, but about the data structure to use. Unless the application needs constant recalculatiion in real-time, I agree with Jherico: simple linear search would be sufficient. – Igor Krivokon May 27 '09 at 02:02
  • 2
    @raindog-2: Both Jherico's and Chaos's answers assume linear searches - O(N). It's just that Chaos's answer gives an accurate calculation, whereas Jherico's assumes a flat earth. – Smashery May 27 '09 at 02:06
  • Quite right - I'd use an ordered Set of distances, then the first (or last) point(s) are the ones your interested in. – Alex Taylor May 27 '09 at 02:22
  • I just figured finding the closest point on the earth with ~2k points to choose from meant that the increasing inaccuracy between great circle routes and simple trig wouldn't make a difference – Jherico May 27 '09 at 02:27
  • Well, you still could have very bad choices, even with 2000 points, at the wrap-around point of longitude. Especially near the poles. – Nosredna May 27 '09 at 02:34
  • Nod to Nosredna - I noticed this when making a google maps image splicer.. forced me to learn all about the Great Circle mess at that time. – Alex Taylor May 27 '09 at 02:52
  • Why not use CoreLocation's built in distance calculations? – Rog May 27 '09 at 09:43
5

If you need better than O(N), you can only get that if you first pay N lg N for building a spatial hash of some sort (a quadtree, octree, hash grid, or similar). Then each test will be approximately O(lg N), and can be much better typically by caching the last location you checked, if there's a lot of coherency (generally, there is).

I would probably build an octree in Euler (geocentric, XYZ) space, because that allows me to get "true" distance, not "warped" lat/lon distance. However, in practice, a quad tree in lat/lon space will probably work well enough. Once you have a hit, you hold on to that tree node (assuming the tree isn't re-built at runtime), and the next query starts walking from that tree node, and only needs to worry about nodes that may be closer if the previous point moved further away from the previous answer.

2

As you are on iPhone, you can use CoreLoaction to perform the geographic distance - using CLLocation's – getDistanceFrom:

I would be tempted to use a brute force linear search though all 2k points nad, if that isn't fast enough, switch to something like GeoHash to store meta data against your points for search.

Rog
  • 17,070
  • 9
  • 50
  • 73
1

Why not tile the globe into regions? (Think hexes.) Then, either when you add points to your list, or in one big pre-processing loop, for each point, store the region it is.

Then, when searching for points near point A in hex X, you only need to check points in hex X and a maximum of 3 neighbouring hexes.

If this is still too many points to check, add subregions.

Jane Sales
  • 13,526
  • 3
  • 52
  • 57
0

you must consider that to use Dijkstra you must know your node position in the graph, that is instead the problem you're trying to solve; you're not in the graph, but you want to know the closest point to you

so simply, as already Chaos told you, you must calculate all distances beetween your position and all 20.000 points, then sort them

Noomak
  • 371
  • 5
  • 19
-3

I think this algorithm works:

Create an array sorted by latitude Create an array sorted by longitude

To find the closest, first find the closest by latitude by doing a binary search in the latitude array. Do the same for the longitude array. Now you have 2 points, one closest by latitude, the other closest by longitude. Compute the distances to each point via the pythagorean theorem. The closest point wins.

Roger

  • 3
    That's incorrect. Imagine your reference point is at (0,0). The closest point on y might be at (100,10) and the closest on x at (10,100). You may have a point at (11, 11) which will be closer than these two but the algorithm will miss it – Mihai Damian Jan 26 '10 at 10:24