I have two datasets, say check-ins and POIs, and I need to join them on geo-coordinates: if a user was seen within a radius of N km of a POI, the two records should be joined (in other words, I want to collect all users near each POI for further processing). But I'm having trouble with this geo matching...
So far I see two options: 1) implement LSH (locality-sensitive hashing), which looks really complicated and might also hurt performance; 2) split the whole map into regions (a 2D matrix), work out which regions lie within N km of each check-in or POI, and emit a record for each of those regions. The result would then need some deduplication, so I'm not sure this is a valid algorithm at all.
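To make option 2 concrete, here is a minimal single-machine sketch in Python of one way it might work. It is not a tested or distributed implementation. All names (`users_near_pois`, `cell_of`, `cell_deg`) and the tuple layouts are assumptions made up for illustration. Each check-in goes into exactly one grid cell, and only the POI side probes every cell overlapping its bounding box, so a (user, POI) pair is produced from one cell only; a set then collapses repeated check-ins by the same user. The sketch ignores the antimeridian and the poles.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cell_of(lat, lon, cell_deg):
    """Map a coordinate to its (row, col) cell in a fixed-degree grid."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def users_near_pois(checkins, pois, radius_km, cell_deg=0.1):
    """checkins: iterable of (user_id, lat, lon); pois: iterable of (poi_id, lat, lon).

    Returns {poi_id: set of user_ids seen within radius_km of that POI}.
    """
    # Index every check-in under the single cell that contains it.
    grid = defaultdict(list)
    for user_id, lat, lon in checkins:
        grid[cell_of(lat, lon, cell_deg)].append((user_id, lat, lon))

    # Half-height of the bounding box in degrees of latitude.
    dlat = math.degrees(radius_km / EARTH_RADIUS_KM)

    result = {}
    for poi_id, plat, plon in pois:
        # Longitude degrees shrink with latitude, so widen the box using the
        # edge of the box that is farthest from the equator.
        edge_lat = min(abs(plat) + dlat, 89.9)
        dlon = dlat / math.cos(math.radians(edge_lat))

        row_lo, col_lo = cell_of(plat - dlat, plon - dlon, cell_deg)
        row_hi, col_hi = cell_of(plat + dlat, plon + dlon, cell_deg)

        users = set()
        for row in range(row_lo, row_hi + 1):
            for col in range(col_lo, col_hi + 1):
                for user_id, lat, lon in grid.get((row, col), ()):
                    # The grid only narrows down candidates; the exact
                    # distance check decides the match.
                    if haversine_km(plat, plon, lat, lon) <= radius_km:
                        users.add(user_id)
        result[poi_id] = users
    return result
```

If the same idea were moved to a distributed join, the cell could presumably serve as the join key, with each POI replicated to the cells its bounding box covers and each check-in keyed by its own cell, but I have not verified how well that scales.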
Are there any best practices for this?