I have a dataset of 130,000 points in the format (x, y). My goal is to cluster this data using k-means, but first I need to find the optimal number of clusters to pass to the algorithm. How can I apply something like the gap statistic or Levene's test in Python to do this?
Check [this](https://gist.github.com/michiexile/5635273) example using scipy. – Burak Nov 19 '15 at 20:06
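For the gap statistic part of the question, here is a minimal, dependency-free sketch of one way it could be computed for 2-D points. It is not taken from the linked gist. The names `kmeans_wss` and `gap_statistic` and the parameters `k_max`, `n_refs`, and `seed` are made up for illustration. The sketch assumes reference data drawn uniformly from the bounding box of the input, a plain Lloyd's k-means with a single random start, and many more points than `k_max`. It picks the smallest k with Gap(k) >= Gap(k+1) - s(k+1).

```python
import math
import random
import statistics


def _dist2(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans_wss(points, k, rng, n_iter=20):
    """Basic Lloyd's k-means (single random start, an assumption of this sketch).

    Returns the sum of squared distances from each point to its nearest center.
    """
    centers = rng.sample(points, k)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: _dist2(p, centers[c]))
            clusters[j].append(p)
        # Move each center to its cluster mean; keep it if the cluster is empty.
        centers = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
            if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
    return sum(min(_dist2(p, c) for c in centers) for p in points)


def gap_statistic(points, k_max=6, n_refs=5, seed=0):
    """Return (chosen_k, gaps), where gaps[k - 1] is the gap value for k clusters."""
    rng = random.Random(seed)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_lo, x_hi, y_lo, y_hi = min(xs), max(xs), min(ys), max(ys)

    gaps, errs = [], []
    for k in range(1, k_max + 1):
        log_w = math.log(kmeans_wss(points, k, rng))
        ref_logs = []
        for _ in range(n_refs):
            # Reference set: uniform over the bounding box of the data.
            ref = [(rng.uniform(x_lo, x_hi), rng.uniform(y_lo, y_hi))
                   for _ in points]
            ref_logs.append(math.log(kmeans_wss(ref, k, rng)))
        gaps.append(statistics.mean(ref_logs) - log_w)
        # Simulation error term s_k = sd_k * sqrt(1 + 1/B).
        errs.append(statistics.pstdev(ref_logs) * math.sqrt(1 + 1 / n_refs))

    # Smallest k with Gap(k) >= Gap(k+1) - s(k+1); fall back to k_max.
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k, gaps
    return k_max, gaps
```

Calling `gap_statistic(points, k_max=10)` on a list of `(x, y)` tuples returns the chosen k along with the gap values, which are worth plotting rather than trusting the single number. Pure Python will be slow on 130,000 points, so in practice you would probably run this on a random subsample, or replace `kmeans_wss` with a library k-means such as scikit-learn's `KMeans`, whose `inertia_` attribute is the within-cluster sum of squares. As for Levene's test, it checks whether several groups have equal variances, so it does not by itself choose a number of clusters.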