2d Tree Nearest Neighbor Algorithm Clarification

Question

I am trying to implement a recursive nearest neighbour algorithm for a 2d-Tree. Recursion (and unwinding recursion) is still kind of confusing for me and the best pseudocode I have found is from this StackOverflow question:

2D KD Tree and Nearest Neighbour Search

However the answer uses a "Median" value, which I am not sure how to compute. Also the wikipedia article on kd-trees has a nearest neighbour pseudo code that does not use a median value.

I would like to know if it is possible to construct a recursive version of the Nearest Neighbours algorithm without using a median value. If anyone can provide me with pseudo code for this I will be grateful.

Not sure if this will fully clarify things for you, but you could have a look at my recent [answer](http://stackoverflow.com/a/29868500/2573395) on this [question](http://stackoverflow.com/questions/29853162/improving-mitchells-best-candidate-algorithm), that provides a minimal implementation of a 2D KD Tree. — Alex, Apr 26 '15 at 20:53
The syntax is unfamiliar to me as I use mostly Java. But I will go through your answer tomorrow and see if it helps, thanks. — codefi, Apr 26 '15 at 21:13
The linked article's terminology is a bit off. What the OP calls the median is just the splitting value. When possible, it's good to use the data median for splitting because it gives the tree of minimum height, gut this is not a requirement. E.g. if you don't have access to all the data in advance, you can't find the median. All kd-tree construction and search relies on _splitting_ values whether they're medians or not. — Gene, Apr 26 '15 at 21:19
I believe the median is calculated by maintaining a list of the nodes in the current subtree and choosing the middle value. I was wondering if there is a version that you don't need to use a median at all and therefore not maintain a list of the nodes. Even better if there is an example pseudocode algorithm that you can link I appreciate it. I am struggling to find a good example. Thanks. — codefi, Apr 28 '15 at 12:17
@Adamme I know got an upvote for my answer. Thanks, however, it would be nice if you accepted the answer too, so that it appears as answered in the question feed. :) — gsamaras, May 09 '16 at 06:50

score 1 · Answer 1 · edited May 23 '17 at 12:23

If you are desperate in not using a median, you can use mean. Here, there is the simple approach:

Example 1: What is the Mean of these numbers?

6, 11, 7
Add the numbers: 6 + 11 + 7 = 24
Divide by how many numbers (there are 3 numbers): 24 / 3 = 8
The Mean is 8

However, I highly recommend you to go for the median, since the dimensions allow it in your case.

Example: find the Median of 12, 3 and 5

Put them in order:

3, 5, 12

The middle number is 5, so the median is 5.

Source

You do not really need to sort them. Pseudo-sorting is enough, be using Quickselect for example.

In C++ for example you could use use nth_element() to efficiently find the median. You can see my question here, where I needed the median for general dimensions. In the case of 2D, it can sure be simplified.

2d Tree Nearest Neighbor Algorithm Clarification

1 Answers1