
I've been tasked with writing a K-Nearest Neighbor algorithm without using the built-in MATLAB function knnsearch().

I'm following the code provided here: http://www.cs.tut.fi/sgn/arg/SGN-2806/exer1/knn.m

To work out the distances, I'm using absolute differences.

My training data will look like this:

trainingdata = [0 1 3 4 5 0 3 4 9 1 35 13 29 1 3]

And this is the sample data I want to classify with KNN:

inputdata = [1 4 5 5 2 1 3]

This is what I am using to calculate absolute differences:

distancesbetweendata = abs(bsxfun(@minus, inputdata(1:N)), trainingdata(N-1));

This returns 0 or an empty array.

Is this correct?

In the code I'm following, the data is prepared like this:

N = size(sampleData,1);  % N is the number of rows (samples)
M = size(sampleData,2);  % M is the number of columns, used as the index of the class column

However, it uses data with multiple rows, whereas I am using just one.

How should I prepare my data in this instance?
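One possible way to prepare single-row data is sketched below. It assumes each number is its own one-dimensional sample, which the question does not confirm. Under that assumption, `size(..., 1)` on a row vector is 1, so `numel` is the count that matters, and turning the training data into a column lets `bsxfun` compare every training value with every input value. The variable names `nRows`, `nSamples` and `distances` are introduced here for illustration.

```matlab
% Sketch only: assumes each number is one 1-D sample.
trainingdata = [0 1 3 4 5 0 3 4 9 1 35 13 29 1 3];
inputdata    = [1 4 5 5 2 1 3];

nRows    = size(trainingdata, 1);   % 1: a row vector has a single row
nSamples = numel(trainingdata);     % 15: the number of values

% Column of 15 training values against a row of 7 input values.
% bsxfun expands both to 15 x 7, so distances(i, j) is
% |trainingdata(i) - inputdata(j)|.
distances = abs(bsxfun(@minus, trainingdata(:), inputdata(:).'));
```

Each column of `distances` then holds the absolute differences between one input value and all of the training values, ready to be sorted.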

Edit:

This is how I calculate the output:

[sorted, idx] = sort(distance);

classes = sampleData(idx(1:min(N, k)));

resultClass = mode(classes); %OUTPUT
user3089
  • Please add your output as well. – kkuilla Feb 16 '15 at 16:00
  • No. Add the output of `distancesbetweendata`. You are asking if that is correct. You also have to make it clear what the trainingdata and the input data is. Add something like `inputdata = ...` etc to make it clear. – kkuilla Feb 16 '15 at 16:13
  • `distancesbetweendata` returns 0. The initial data and training data are above, but I'll label them better. – user3089 Feb 16 '15 at 16:26
  • @user3089 - Your question has already been answered. The duplicate post that I referred you to shows a detailed implementation of KNN without the use of `knnsearch`. – rayryeng Feb 16 '15 at 18:28
  • That is for finding the Euclidean distance; I've altered the title. – user3089 Feb 16 '15 at 18:46
  • @user3089 - I recently edited my post to accommodate for the Manhattan / absolute distance. You should check out the post again, and I'm still closing as a duplicate. – rayryeng Feb 16 '15 at 19:12
  • @user3089 - Even though I'm leaving as a duplicate, what's wrong with your implementation is that you are forgetting to sum all of the absolute distances together. – rayryeng Feb 16 '15 at 19:13
  • I tried implementing it, but the Manhattan distances cause an error: 'Matrix dimensions must agree.' I don't have enough reputation to comment on the original post. – user3089 Feb 16 '15 at 19:57
  • @rayryeng `dists = sum(abs(bsxfun(@minus, x, sample)), 2);` gives me: "Non-singleton dimensions of the two input arrays must match each other." – user3089 Feb 16 '15 at 20:38
  • You need to read how the inputs are shaped. `x` is a `M x N` array where `M` is the number of samples and `N` is the dimensionality of the sample. `sample` is a `1 x N` input. Make sure your data conforms to this shaping before running the code. This is also the same way that `knnsearch` accepts inputs. This is where I will stop replying as you really need to read the post carefully. – rayryeng Feb 16 '15 at 20:42
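The shaping described in the last comment can be sketched as follows. The data, the `labels` vector and the value of `k` are hypothetical and chosen only for illustration; they are not taken from the question or from the linked post. The sketch assumes `x` is `M x N` with one sample per row and `sample` is `1 x N`, which is the layout under which the `bsxfun` call from the comments should run without the dimension error.

```matlab
% Hypothetical data, for illustration only: 4 samples with 2 features each.
x      = [0 1; 3 4; 5 0; 9 1];   % M x N, one sample per row
labels = [1; 1; 2; 2];           % M x 1, hypothetical class labels
sample = [4 1];                  % 1 x N query point
k      = 3;                      % number of neighbours (assumed)

% Manhattan (L1) distance: absolute differences summed across the N features.
dists = sum(abs(bsxfun(@minus, x, sample)), 2);   % M x 1

% Sort ascending, then take a majority vote among the k nearest labels.
[~, idx]    = sort(dists);
nearest     = labels(idx(1:min(k, size(x, 1))));
resultClass = mode(nearest);
```

If `sample` were passed as a column, or had a different number of features than `x` has columns, the non-singleton dimensions would no longer match, which is consistent with the error quoted above.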
