1

I am developing a PHP application which works with a set of locations provided via Latitude, Longitude-Coordinates in a MySQL-Table. As those locations are imported via external sources, I wanna provide a "merge"-view to the user where they can see groups of locations that are within X meters of each other and merge them into one.

The problem of implementing proximity search has been discussed a lot, for example:

Most of them use the standard haversine formula to calculate distances:

SELECT 
  -- stuff here
  , ( 6371000 * acos( cos( radians($LAT) ) * cos( radians( stuff.lat ) ) * cos( radians( stuff.lng ) - radians($LNG) ) + sin( radians($LAT) ) * sin(radians(stuff.lat)) ) ) AS distance 
FROM 
  stuff
HAVING 
  distance < $distance

However this only works if I have one point to search around, but I wanna find ALL groups of locations, that are within X meters to each other.

My simple solution would be to fetch all locations to PHP-code, then iterate over them and use above query to find all nearby locations for each location. Afterwards I clean duplicate groups. But this solution has a cost of n², because I create n queries that need to calculate the distance to all other locations. The second part might be improved with a bounding box, however I still need to execute n queries (to get all nearby locations for each location).

Is there a more performant method? Maybe even inside one MySQL Query (to just output the ids of the locations per group)?

The table with the locations is just 'id', 'name', 'lat' and 'lng'.

  • Divide the whole area with a squares with the side equal (or slightly more) then the range you are searching for (X meters). Perform this expensive calculations only for those points pairs which squares are adjacent by side or corner. And - if approximate result is enough then simply divide by squares with the `side ~ X * 0.7` and take pairs in adjacent squares. – Akina Jan 22 '20 at 10:19
  • Okay, but how do I get the "point pairs which have adjacent squares"? I have to compute the square geometry for each point, havent I? – ArcticXWolf Jan 22 '20 at 10:40
  • What geometry? Simply divide latitude/longtitude by the distance and round to integer. – Akina Jan 22 '20 at 11:08
  • Ah yeah. But I just encountered an additional problem: What if you have locations A -> B <- C and AB is below X meters, BC is below X meters, but AC isnt. Which groups do I want? Two groups of AB,BC or one with all of them? Maybe I need to rethink the problem itself. – ArcticXWolf Jan 22 '20 at 12:30
  • 1
    *What if you have locations A -> B <- C and AB is below X meters, BC is below X meters, but AC isnt. Which groups do I want? Two groups of AB,BC or one with all of them?* This looks like clusterization, which is more complex task then to find neighboring points for each separate point only. *Maybe I need to rethink the problem itself.* Maybe... moreover, the probability of this is quite high, it seems. – Akina Jan 22 '20 at 12:35
  • Exactly. I think I will just provide a view of nearby locations on the view of each location, thus I can iterate over the set just once. I wont have a merging view which shows all groups, but I believe it will suffice. Thanks for your help! – ArcticXWolf Jan 22 '20 at 13:50
  • @ArcticXWolf - Read this: https://en.wikipedia.org/wiki/Fractal_dimension_on_networks It may give you a clue of the complexity of what you are asking. – Rick James Jan 25 '20 at 20:12

0 Answers0