How can I generate clusters with the k-means algorithm without giving the k value? I want to do k-means clustering and have the number of clusters determined automatically.
-
see this https://datasciencelab.wordpress.com/2013/12/27/finding-the-k-in-k-means-clustering/ – ZenithS Nov 04 '15 at 09:48
-
Thanks ZenithS, the link you suggested is really helpful – Asif Nov 04 '15 at 09:56
-
A simple google search resulted in https://en.wikipedia.org/wiki/Determining_the_number_of_clusters_in_a_data_set – Ramón J Romero y Vigil Nov 04 '15 at 10:20
-
Possible duplicate of [How do I determine k when using k-means clustering?](http://stackoverflow.com/questions/1793532/how-do-i-determine-k-when-using-k-means-clustering) – Has QUIT--Anony-Mousse Nov 04 '15 at 22:17
2 Answers
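You may try mean shift clustering. It behaves much like k-means clustering but does not have a k parameter; the number of clusters comes out of the data (you do still choose a neighborhood size, usually called the bandwidth).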
The basic idea is as follows: clustering is like boosting the "high frequencies" in your dataset, or "sharpening" it, in order to find the "modes" (the modes correspond to the significant "trends" in your dataset). The inverse operation, smoothing the dataset, is easier to define: replace each sample with the mean of its neighbors. From this definition you can extract the "high frequency" component of the signal as the difference between the initial signal and the smoothed one. This difference gives you a "gradient direction", or a "good move", that sharpens the signal when applied repeatedly. At the end of the process, all the samples have collapsed onto a small number of points, which correspond to the modes.
Reference: https://en.wikipedia.org/wiki/Mean_shift
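The idea above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a reference implementation: the function name `mean_shift`, the flat kernel (a plain average of all samples within `bandwidth`), the merge radius of `bandwidth / 2`, and the sample data are all assumptions made for the example.

```python
import math

def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift sketch; returns (modes, labels)."""
    points = [tuple(p) for p in points]

    def neighbors_mean(p):
        # Smoothing step: average of the original samples within `bandwidth` of p.
        near = [q for q in points if math.dist(p, q) <= bandwidth]
        if not near:  # guard against floating-point edge cases
            return p
        return tuple(sum(q[i] for q in near) / len(near) for i in range(len(p)))

    # Move every sample toward the local mean until it stops moving.
    shifted = []
    for p in points:
        for _ in range(max_iter):
            m = neighbors_mean(p)
            moved = math.dist(p, m)
            p = m
            if moved < tol:
                break
        shifted.append(p)

    # Samples that ended up close together share a mode, i.e. a cluster.
    modes, labels = [], []
    for p in shifted:
        for idx, mode in enumerate(modes):
            if math.dist(p, mode) <= bandwidth / 2:
                labels.append(idx)
                break
        else:
            modes.append(p)
            labels.append(len(modes) - 1)
    return modes, labels

# Two well-separated clumps; no k is given, only a bandwidth.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
modes, labels = mean_shift(data, bandwidth=2.0)
print(len(modes), labels)
```

Note that the bandwidth takes over the role of k: a larger value merges more samples into fewer modes. For real data a library implementation (for example scikit-learn's `MeanShift`) is probably a better choice than this quadratic-time sketch.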

There is also X-means (a variation of k-means), which is implemented in Weka. For more information, see the documentation.
