I want to use k-nearest neighbours (kNN) for multi-label classification. There are several kNN-based classifiers, such as MLKNN, that are implemented in the Mulan library or written in C or MATLAB.

When I run the same classifier on a numeric dataset, I get identical results from the different implementations, but for nominal datasets such as slashdot and genbase (note that the data contain only 0 and 1) I obtain different results.

Why does this happen? These classifiers use Euclidean distance, and Mulan uses Weka's Euclidean distance.

Why do the lazy classifiers in Mulan give different results on nominal data than the implementations written in other languages? Which one is correct? I would be grateful for any help in finding the reason.

niloofar
Note that MULAN uses WEKA's `EuclideanDistance`, which has the attribute `dontNormalize` set to false, which means that the data are normalised by default. Check whether this is the cause of the discrepancy you are getting. Also see this question I asked before: https://stackoverflow.com/questions/41680764/a-discrepancy-in-computing-nearest-neighbours-between-r-and-java-weka – phoxis May 10 '18 at 14:37
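
One way to run the check suggested in the comment is to compute nearest neighbours with and without normalisation and compare. The sketch below is plain Python and does not call WEKA or Mulan. It assumes the normalisation is per-attribute min-max scaling to [0, 1]. The helper names (`euclidean`, `min_max_normalise`, `nearest_index`) and the toy data are made up for illustration.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_max_normalise(rows):
    """Rescale every column to [0, 1] (assumed normalisation scheme).

    A constant column is mapped to 0.0 to avoid dividing by zero.
    """
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def nearest_index(rows, query_idx):
    """Index of the row closest to rows[query_idx], excluding itself."""
    query = rows[query_idx]
    candidates = [i for i in range(len(rows)) if i != query_idx]
    return min(candidates, key=lambda i: euclidean(rows[i], query))


# Toy numeric data: the second attribute has a much larger range.
numeric = [[1.0, 100.0], [2.0, 100.0], [1.0, 160.0], [5.0, 400.0]]

# Toy 0/1 data in which every attribute takes both values.
binary = [[0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1]]

raw_nn = nearest_index(numeric, 0)
scaled_nn = nearest_index(min_max_normalise(numeric), 0)
binary_unchanged = min_max_normalise(binary) == binary

print(raw_nn, scaled_nn, binary_unchanged)
```

In this toy example the nearest neighbour of the first numeric row changes once the columns are rescaled, while the 0/1 data come out of min-max scaling unchanged. If the same holds for your datasets, normalisation alone may not explain the difference, and it could be worth also comparing how each implementation orders equidistant neighbours, since ties are common with binary attributes.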

0 Answers