I have a fairly large set of 2D points (~20000), and for each point in the x-y plane I want to determine which point from the set is closest. (Actually, the points are of different types, and I just want to know which type is closest. The x-y plane is a bitmap, say 640x480.)
From this answer to the question "All k nearest neighbors in 2D, C++" I got the idea to make a grid. I created n*m C++ vectors and put each point into the vector for the bin it falls into. The idea is that you only have to check the distances to the points in that bin, instead of to all points. If there is no point in the bin, you continue with the adjacent bins in a spiralling manner.
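To make the setup concrete, here is a minimal sketch of the binning step. The names `Point`, `Grid`, `cellX`, `cellY`, `binIndex` and `insert` are made up for illustration, and the square cell size in pixels is an assumption; my actual code differs in details.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical point type: a position plus the "type" tag I care about.
struct Point {
    float x, y;
    int type;
};

// Uniform grid over a width x height bitmap.
// Each bin stores indices into the original point array.
struct Grid {
    int cellSize;  // side length of a square cell, in pixels (assumed > 0)
    int cols, rows;
    std::vector<std::vector<std::size_t>> cells;  // cols * rows bins, row-major

    Grid(int width, int height, int cellSize_)
        : cellSize(cellSize_),
          cols((width + cellSize_ - 1) / cellSize_),   // round up
          rows((height + cellSize_ - 1) / cellSize_),  // round up
          cells(static_cast<std::size_t>(cols) * rows) {}

    // Column of the cell containing x, clamped to the grid.
    int cellX(float x) const {
        return std::max(0, std::min(cols - 1, static_cast<int>(x) / cellSize));
    }

    // Row of the cell containing y, clamped to the grid.
    int cellY(float y) const {
        return std::max(0, std::min(rows - 1, static_cast<int>(y) / cellSize));
    }

    // Flat index of the bin at column cx, row cy.
    std::size_t binIndex(int cx, int cy) const {
        assert(cx >= 0 && cx < cols && cy >= 0 && cy < rows);
        return static_cast<std::size_t>(cy) * cols + cx;
    }

    // Put every point's index into the bin it falls into.
    void insert(const std::vector<Point>& pts) {
        for (std::size_t i = 0; i < pts.size(); ++i)
            cells[binIndex(cellX(pts[i].x), cellY(pts[i].y))].push_back(i);
    }
};
```

With a 640x480 bitmap and 32-pixel cells this gives 20 x 15 bins, and a probe only has to look at `cells[binIndex(cellX(px), cellY(py))]` first.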
Unfortunately, I only read Oli Charlesworth's comment afterwards:
Not just adjacent, unfortunately (consider that points in the cell two to the east may be closer than points in the cell directly north-east, for instance; this problem gets much worse in higher dimensions). Also, what if the neighbouring cells happen to have less than 10 points in them? In practice, you will need to "spiral out".
Fortunately, I already had the spiralling code figured out (a nice C++ version here, and there are other versions in the same question). But I'm still left with the problem:
If I find a hit in a cell, there could be a closer hit in an adjacent cell (yellow is my probe, red is the wrong choice, green the actual closest point):
If I find a hit in an adjacent cell, there could be a hit in a cell 2 steps away, as Oli Charlesworth remarked:
But even worse, if I find a hit in a cell two steps away, there could still be a closer hit in a cell three steps away! That means I'd have to consider all cells with dx, dy = -3...3, or 49 cells!
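The cases above can be checked with concrete numbers. This is only a small sketch: the cell size of 10 and the names `P`, `distSq` and `ring` are arbitrary choices for illustration, and "steps away" is taken to mean the larger of the column and row offsets between two cells.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Hypothetical minimal point type for this illustration.
struct P {
    double x, y;
};

// Squared Euclidean distance between two points.
double distSq(P a, P b) {
    double dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Number of "steps" between the cells containing probe and q:
// the larger of the column and row offsets (the spiral ring index).
// Assumes non-negative coordinates, as in a bitmap.
int ring(P probe, P q, double cellSize) {
    assert(cellSize > 0);
    int dx = std::abs(static_cast<int>(q.x / cellSize) -
                      static_cast<int>(probe.x / cellSize));
    int dy = std::abs(static_cast<int>(q.y / cellSize) -
                      static_cast<int>(probe.y / cellSize));
    return std::max(dx, dy);
}
```

With a probe at (5, 5), a point at (19, 19) sits in the diagonally adjacent cell (1 step) at squared distance 392, while a point at (20, 5) is 2 steps away at squared distance 225, so the farther cell holds the closer point. Likewise (29, 29) is 2 steps away at squared distance 1152, but (30, 5) is 3 steps away at squared distance 625.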
Now, in practice this won't happen often, because I can choose my bin size so the cells are filled enough. Still, I'd like to have a correct result, without iterating over all points.
So how do I find out when to stop "spiralling" or searching? I heard there is an approach with multiple overlapping grids, but I didn't quite understand it. Is it possible to salvage this grid technique?