
I already asked this in the RapidMiner forum, but no one has answered yet: https://community.rapidminer.com/discussion/55963/how-k-nn-algorithms-work-with-same-distance-in-rapidminer#latest

I can't find a satisfying answer on how the k-NN algorithm in RapidMiner handles neighbors with the same Euclidean distance.

I found a similar question, but it is not about RapidMiner: K Nearest-Neighbor Algorithm

Say k=5, and I try to classify an unknown object by finding its 5 nearest neighbors. What happens when several distances are equal? Suppose that after determining the 4 nearest neighbors, the next 2 (or more) nearest objects are at the same distance but have different labels. Which of these objects does RapidMiner choose as the 5th nearest neighbor?

I'm confused. I did the calculation in Excel, and for some data the result differs from RapidMiner's. In Excel the resulting label is "LU": https://i.ibb.co/RSYnTWg/Capturess.jpg

But the result in RapidMiner is "LT": https://i.ibb.co/NKv0bmp/4.jpg

With weighted vote checked, the RapidMiner result is "LU": https://i.ibb.co/r68y05v/5.jpg

How does RapidMiner handle a case like this? How does it sort the distances? Is something wrong with my data, or does RapidMiner order the examples randomly when the distances are equal?

Has QUIT--Anony-Mousse

2 Answers


It is not well defined what to do in such cases.

Some implementations always return exactly 5 objects (which means there could be multiple different correct answers!), while others use all tied objects, and yet others use all tied objects but reduce their weight.
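To make the three strategies concrete, here is a minimal Python sketch. It is not taken from any particular library; the function name `knn_vote`, the `ties` option names, and the example distances are all made up for illustration (only the labels "LU" and "LT" come from the question). It assumes `k` is no larger than the number of candidates and compares distances with exact equality.

```python
from collections import Counter

def knn_vote(neighbors, k, ties="first"):
    """Count label votes among the k nearest neighbors.

    neighbors: list of (distance, label) pairs, in training order.
    ties (hypothetical option names):
      "first"  -> keep exactly k; tied candidates are taken in input order
      "all"    -> keep every candidate tied with the k-th distance
      "shared" -> keep all tied candidates, but they split the open slots
    """
    ranked = sorted(neighbors, key=lambda n: n[0])  # stable sort
    cutoff = ranked[k - 1][0]                       # distance of the k-th neighbor
    closer = [n for n in ranked if n[0] < cutoff]
    tied = [n for n in ranked if n[0] == cutoff]
    slots = k - len(closer)                         # places left for tied candidates

    votes = Counter()
    for _, label in closer:
        votes[label] += 1.0

    if ties == "first":
        for _, label in tied[:slots]:
            votes[label] += 1.0
    elif ties == "all":
        for _, label in tied:
            votes[label] += 1.0
    elif ties == "shared":
        for _, label in tied:
            votes[label] += slots / len(tied)
    else:
        raise ValueError(f"unknown tie strategy: {ties}")
    return votes

# Hypothetical data: four clear neighbors, then two candidates tied for 5th place.
data = [(1.0, "LU"), (1.5, "LT"), (2.0, "LU"), (2.5, "LT"),
        (3.0, "LT"), (3.0, "LU")]

for strategy in ("first", "all", "shared"):
    print(strategy, dict(knn_vote(data, 5, strategy)))
```

With this data, "first" gives LT a 3 to 2 majority, while "all" and "shared" both end in a draw, so the three strategies can legitimately disagree on the same input.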

You'll need to check the source code, because I wouldn't be surprised if the manuals are not detailed enough.

Has QUIT--Anony-Mousse

In these cases, where distances are the same, RapidMiner kNN uses the internal sorting of the ExampleSet that was used at training time. So internally it picks the examples that it "saw first".

Try changing the sort order of the ExampleSet before building the kNN model; it should give different results.
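The effect is easy to reproduce outside RapidMiner. The Python sketch below is only an illustration of "first seen wins" tie-breaking through a stable sort; it is not RapidMiner's code, and the `predict` helper and the distances are invented for the example.

```python
from collections import Counter

def predict(train, k):
    """Majority vote over the k nearest rows.

    train: list of (distance_to_query, label) pairs in training order.
    Python's sort is stable, so rows with equal distance keep their
    training order, and the row seen first wins the last slot.
    """
    nearest = sorted(train, key=lambda row: row[0])[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical training rows: the last two are tied for 5th place.
rows = [(1.0, "LU"), (1.5, "LT"), (2.0, "LU"), (2.5, "LT"),
        (3.0, "LT"), (3.0, "LU")]

# Same rows, but the two tied rows appear in the opposite order.
reordered = rows[:4] + [rows[5], rows[4]]

print(predict(rows, 5))       # the tied "LT" row comes first
print(predict(reordered, 5))  # the tied "LU" row comes first
```

The two calls use exactly the same examples and return different labels ("LT", then "LU"), which matches the kind of mismatch between a hand calculation and the tool described in the question.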

You can verify this in the official source code on GitHub: https://github.com/rapidminer/rapidminer-studio/blob/master/src/main/java/com/rapidminer/operator/learner/lazy/KNNClassificationModel.java

Christian König