
I am trying to use KNN with cosine distance, but it looks like the metric parameter does not accept cosine distance. Only the metrics below are listed at http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.DistanceMetric.html. Why is that?

Metrics intended for real-valued vector spaces:

| identifier | class name | args | distance function |
| --- | --- | --- | --- |
| “euclidean” | EuclideanDistance | | `sqrt(sum((x - y)^2))` |
| “manhattan” | ManhattanDistance | | `sum(\|x - y\|)` |
| “chebyshev” | ChebyshevDistance | | `max(\|x - y\|)` |
| “minkowski” | MinkowskiDistance | p | `sum(\|x - y\|^p)^(1/p)` |
| “wminkowski” | WMinkowskiDistance | p, w | `sum(w * \|x - y\|^p)^(1/p)` |
| “seuclidean” | SEuclideanDistance | V | `sqrt(sum((x - y)^2 / V))` |
| “mahalanobis” | MahalanobisDistance | V or VI | `sqrt((x - y)' V^-1 (x - y))` |

Metrics intended for two-dimensional vector spaces:

| identifier | class name | distance function |
| --- | --- | --- |
| “haversine” | HaversineDistance | `2 arcsin(sqrt(sin^2(0.5*dx) + cos(x1)cos(x2)sin^2(0.5*dy)))` |

SriK
Possible duplicate of [Using cosine distance with scikit learn KNeighborsClassifier](http://stackoverflow.com/questions/34144632/using-cosine-distance-with-scikit-learn-kneighborsclassifier) – ayhan Jul 31 '16 at 21:07

1 Answer


Cosine distance isn't a proper distance metric, because it doesn't satisfy the triangle inequality. It is derived from the angle between two vectors and doesn't represent a shortest distance in any sense. This is described well at https://en.wikipedia.org/wiki/Cosine_similarity. For K-Means, or any other algorithm that relies on distances to measure similarity, satisfying the requirements of a metric (https://en.wikipedia.org/wiki/Metric_(mathematics)) is necessary.
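To make the triangle-inequality point concrete, here is a minimal sketch in plain Python. The helper `cosine_distance` and the three sample vectors are my own choices for illustration, not anything taken from scikit-learn; the distance is assumed to be defined as 1 minus the cosine similarity.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(angle between u and v) for two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Illustrative vectors: y lies halfway (45 degrees) between x and z.
x, y, z = (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)

direct = cosine_distance(x, z)                         # going straight from x to z
via_y = cosine_distance(x, y) + cosine_distance(y, z)  # detour through y

# A true metric would guarantee direct <= via_y; here the detour is shorter.
violates_triangle_inequality = direct > via_y
print(direct, via_y, violates_triangle_inequality)
```

With these vectors the direct distance is larger than the detour through `y`, which a true metric would not allow. As far as I know, this only rules out the neighbor searches that depend on metric properties; scikit-learn's brute-force search (`algorithm='brute'`) should still accept `metric='cosine'`, as the linked duplicate discusses.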

SriK