
I have a set of points and I want to group them into clusters. I know how to run the standard k-means algorithm, but I don't want to supply 'k' as input. For example, given the points 1, 3, 4, 50, 60, 70, 1000, 1002, 1004, the algorithm should produce 3 clusters, C1: 1, 3, 4; C2: 50, 60, 70; C3: 1000, 1002, 1004, so that the distances between elements within a cluster are minimal and the distances between clusters are maximal.

denis
Navin
  • Why did you use the word random? – Gumbo May 09 '11 at 07:51
  • @Gumbo: Since I don't want to take k as input, I simply called it random clustering. Does that term mean something else? – Navin May 09 '11 at 08:03
  • maybe this helps... http://www.slideshare.net/pierluca.lanzi/machine-learning-and-data-mining-08-clustering-hierarchical – mkn May 09 '11 at 12:13
  • possible duplicate of [How do I determine k when using k-means clustering?](http://stackoverflow.com/questions/1793532/how-do-i-determine-k-when-using-k-means-clustering) – Gilles 'SO- stop being evil' Jun 06 '11 at 10:08

2 Answers


See [How do I determine k when using k-means clustering?](http://stackoverflow.com/questions/1793532/how-do-i-determine-k-when-using-k-means-clustering) and the links there.

Community
denis

Deciding on k is a problem that recurs with many clustering algorithms. You might want to consider spectral clustering (and its various algorithmic cousins), which manages to alleviate that problem to some extent. However, many versions (although not all) use k-means as the final step, which returns you to square one.
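To illustrate how the choice tends to resurface as a different parameter, here is a minimal sketch for one-dimensional data. It is not spectral clustering but a single-linkage style split: sort the points and start a new cluster wherever the gap to the previous point exceeds a threshold. The function name `split_by_gap` and the parameter `max_gap` are my own, chosen for illustration.

```python
def split_by_gap(points, max_gap):
    """Group 1-D points; a gap larger than max_gap starts a new cluster."""
    pts = sorted(points)
    if not pts:
        return []
    clusters = [[pts[0]]]
    for prev, cur in zip(pts, pts[1:]):
        if cur - prev > max_gap:
            clusters.append([cur])      # gap too large: open a new cluster
        else:
            clusters[-1].append(cur)    # close enough: extend the current one
    return clusters


if __name__ == "__main__":
    data = [1, 3, 4, 50, 60, 70, 1000, 1002, 1004]
    print(split_by_gap(data, max_gap=20))
    # [[1, 3, 4], [50, 60, 70], [1000, 1002, 1004]]
```

No k is needed, but `max_gap` now has to be chosen instead: with the same data, a threshold of 5 yields five clusters and a threshold of 100 yields two.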

Alternatively, there are many approaches for finding a good value of k, such as those linked from the answer by denis above; this might be enough for your purposes.
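As one example of such an approach, here is a rough sketch that runs k-means for several values of k and keeps the one with the highest mean silhouette score. It is pure Python and limited to 1-D data. The helper names (`kmeans_1d`, `mean_silhouette`, `choose_k`) and the deterministic initialisation are assumptions of mine, not taken from any library or from the linked question.

```python
def kmeans_1d(points, k, max_iter=100):
    """Plain Lloyd's k-means on 1-D data with a deterministic start."""
    pts = sorted(points)
    n = len(pts)
    # Assumption: initialise centers at evenly spaced order statistics.
    centers = [pts[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in pts:
            nearest = min(range(k), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        new_centers = [sum(c) / len(c) if c else centers[j]
                       for j, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return [c for c in clusters if c]   # drop empty clusters


def mean_silhouette(clusters):
    """Mean silhouette over all points; singleton clusters score 0."""
    scores = []
    for i, own in enumerate(clusters):
        for p in own:
            if len(own) == 1:
                scores.append(0.0)
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(abs(p - q) for q in own) / (len(own) - 1)
            # b: smallest mean distance to any other cluster
            b = min(sum(abs(p - q) for q in other) / len(other)
                    for j, other in enumerate(clusters) if j != i)
            scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def choose_k(points, k_values):
    """Return (best_k, {k: (score, clusters)}) for the candidate k values."""
    results = {}
    for k in k_values:
        clusters = kmeans_1d(points, k)
        if len(clusters) >= 2:          # silhouette needs at least 2 clusters
            results[k] = (mean_silhouette(clusters), clusters)
    best_k = max(results, key=lambda k: results[k][0])
    return best_k, results


if __name__ == "__main__":
    data = [1, 3, 4, 50, 60, 70, 1000, 1002, 1004]
    best_k, results = choose_k(data, range(2, 5))
    for k, (score, clusters) in results.items():
        print(k, round(score, 3), clusters)
    print("best k:", best_k)
```

On the question's sample (reading the last three values as 1000, 1002, 1004), this criterion rates k = 2 above k = 3, because the gap before 1000 dwarfs the gap before 50. So an automatic criterion will not necessarily reproduce the grouping you have in mind, and it is worth checking a few criteria against your data.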

Elli Amir