0

Does there exist nearest neighbor data structure that supports delete and add operations along with exact nearest neighbor queries? Looking for a Python implementation ideally.

Attempts:

  • Found MANY implementations for approximate nearest neighbor queries in high dimensional spaces.
  • Found KD Trees and Ball Trees but they do not allow for dynamic rebalancing.
  • Thinking an algorithm could be possible with locality sensitive hashing.
  • Looking at OctTrees.

Context:

  1. For each point of 10,000 points, query for it's nearest neighbor
  2. Evaluate each pair of neighbors
  3. Pick one and delete the pair of points and add a merged point.
  4. Repeat for some number of iterations

1 Answers1

0

Yes. There exists such a datastructure. I invented one. I had exactly this problem at hand. The datastructure makes KD-trees seem excessively complex. It consists of only sorted lists of points in each dimensionality the points have.

Obviously you can add and removing a n-dimensional point from n lists sorted by their respective dimensions rather trivially without issue. A lot of the tricks allows one to iterate these lists and mathematically prove you have the shortest distance to a point. See my answer here for elaboration and code.

I must note though that your context is wrong. The closest point for A may be B, but it doesn't hold that B's closest point is necessarily A. You could rig a chain of points such that each distance between each link is less than the one before but also necessarily further than the other points resulting in there being only 1 pair of neighbors that share their nearest neighbor.

Tatarize
  • 10,238
  • 4
  • 58
  • 64