What clustering algorithm to use on 1-d data?

Question

I have a list of numbers in an array. The index of each element is X and the value is Y. How do I go about partitioning/clustering this data? Given an array, I just want a set of values that mark the end of each partition. Since I'm working in Python, please mention any libraries that do this.

Thanks.

What's the data? What's your application? Are you sure you want clustering rather than segmenting? i.e. Do you want all points in a cluster to be contiguous X samples? This is what you'd usually do for a time series. — dimatura, May 27 '11 at 06:53
possible duplicate of [not random clusters in 1D data set](http://stackoverflow.com/questions/5738490/not-random-clusters-in-1d-data-set) — Has QUIT--Anony-Mousse, Feb 01 '13 at 07:42

score 5 · Accepted Answer · answered May 27 '11 at 03:19

K-Means is a very simple clustering algorithm; I would say it is the first thing to try before moving on to anything more complex. See the K-Means algorithm: http://en.wikipedia.org/wiki/K-means_clustering
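To make the idea concrete, here is a minimal sketch of K-Means (Lloyd's algorithm) on plain scalars, using only the standard library. The function name `kmeans_1d`, the evenly spread starting centroids, and the sample data are illustrative assumptions, not part of the original answer. Since the question asks for partition boundaries, the sketch also returns the midpoints between neighbouring centroids, which is where the nearest-centroid assignment flips in 1-d.

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm on scalars; returns (sorted centroids, cut points)."""
    data = sorted(values)
    # Naive initialization (an assumption): k values spread evenly over the sorted data.
    centroids = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centroid.
        groups = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            groups[nearest].append(v)
        # Update step: each centroid moves to the mean of its group
        # (an empty group keeps its previous centroid).
        updated = [sum(g) / len(g) if g else centroids[j] for j, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    centroids.sort()
    # In 1-d, the boundary between neighbouring clusters is the midpoint of their centroids.
    cuts = [(a + b) / 2 for a, b in zip(centroids, centroids[1:])]
    return centroids, cuts


centroids, cuts = kmeans_1d([1.0, 1.2, 0.9, 5.0, 5.3, 4.8, 9.9, 10.1, 10.4], k=3)
```

Note that this clusters the values (Y) only and ignores the index (X); you still have to choose `k` yourself.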

Proper K-Means initialization (k-means++) is strongly advised: http://en.wikipedia.org/wiki/K-means%2B%2B
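A rough sketch of the k-means++ seeding step, again with only the standard library: the first seed is picked uniformly at random, and each further seed is sampled with probability proportional to its squared distance from the closest seed chosen so far. The function name `kmeanspp_seeds` and the sample data are hypothetical, and the sketch assumes the data contains at least `k` distinct values.

```python
import random


def kmeanspp_seeds(values, k, rng=None):
    """Pick k initial centroids from values using k-means++ (D^2) sampling."""
    rng = rng or random.Random()
    seeds = [rng.choice(values)]
    while len(seeds) < k:
        # Squared distance from each value to its closest already-chosen seed.
        weights = [min((v - s) ** 2 for s in seeds) for v in values]
        # Sample the next seed with probability proportional to that distance,
        # so far-away values are favoured and values equal to a seed (weight 0)
        # are never picked again.
        seeds.append(rng.choices(values, weights=weights, k=1)[0])
    return seeds


data = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8, 9.9, 10.1, 10.4]
seeds = kmeanspp_seeds(data, k=3, rng=random.Random(0))
```

The returned seeds would then be used as the starting centroids for the usual K-Means iterations.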

If you're not happy with K-Means, you can use the EM algorithm with a Gaussian mixture model (http://en.wikipedia.org/wiki/Mixture_model). It is not too hard to code, and you can use K-Means to initialize it!
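As a sketch of what "not too hard to code" might look like, here is EM for a 1-d Gaussian mixture in plain Python. The function name `gmm_em_1d`, the starting variance heuristic, the variance floor, and the hard-coded initial means are all assumptions made for illustration; in practice the initial means would come from a K-Means run, as suggested above.

```python
import math


def gmm_em_1d(values, means, iterations=200):
    """EM for a 1-d Gaussian mixture; `means` are the initial component means."""
    k, n = len(means), len(values)
    means = list(means)
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n
    floor = 1e-6  # variance floor (an assumption) so a component cannot collapse
    # Starting variance heuristic (an assumption): each component covers ~1/k of the spread.
    variances = [max(overall_var / k ** 2, floor)] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            dens = [
                weights[j]
                * math.exp(-((v - means[j]) ** 2) / (2 * variances[j]))
                / math.sqrt(2 * math.pi * variances[j])
                for j in range(k)
            ]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, values)) / nj
            variances[j] = max(
                sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, values)) / nj,
                floor,
            )
    # Hard labels: the component with the highest responsibility for each value.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return weights, means, variances, labels


data = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8, 9.9, 10.1, 10.4]
weights, means, variances, labels = gmm_em_1d(data, means=[1.0, 5.0, 10.0])
```

Unlike K-Means, each component gets its own variance and weight, so clusters of different widths and sizes can be modelled.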

These have been implemented many times over in Python; check any machine learning toolbox.

Monkey

5

SciPy has a very friendly implementation of kmeans in its cluster package. I was just using it today as a matter of fact, and I happen to have the docs in another tab right now: http://docs.scipy.org/doc/scipy/reference/cluster.vq.html – jscs May 27 '11 at 03:27
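For reference, a small sketch of how the `scipy.cluster.vq` functions `kmeans` and `vq` might be applied to 1-d values. The sample data and the choice of three clusters are assumptions. `kmeans` starts from randomly chosen observations, so results can vary between runs, and it may return fewer centroids than requested if one ends up with no observations.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

values = np.array([1.0, 1.2, 0.9, 5.0, 5.3, 4.8, 9.9, 10.1, 10.4])
obs = values.reshape(-1, 1)  # one row per observation, a single column for 1-d data

# kmeans returns the centroids (the "codebook") and the mean distortion.
codebook, distortion = kmeans(obs, 3)
centers = np.sort(codebook.ravel())

# vq assigns each observation to the index of its nearest centroid.
labels, _ = vq(obs, centers.reshape(-1, 1))

# Cut points between neighbouring clusters: midpoints of adjacent centroids.
cuts = (centers[:-1] + centers[1:]) / 2
```

Here `cuts` is the set of boundary values the question asks for, and `labels[i]` is the cluster of `values[i]`.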
2

**Don't use k-means on 1-d data. Use optimized 1-d techniques.** – Has QUIT--Anony-Mousse Feb 01 '13 at 07:41
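The comment does not name a specific technique. One possible reading, offered here as an assumption, is to exploit the fact that 1-d data can be sorted: every cluster is then a contiguous run, and the partition that minimizes the within-cluster sum of squares can be found exactly with dynamic programming instead of random restarts. The function name `optimal_1d_clusters` and the sample data are hypothetical, and this simple version runs in O(k·n²) time.

```python
def optimal_1d_clusters(values, k):
    """Split sorted values into k contiguous groups minimising the total
    within-group sum of squared deviations. Assumes len(values) >= k."""
    data = sorted(values)
    n = len(data)
    # Prefix sums give the cost of any run data[i:j] in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(data):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def cost(i, j):
        # Sum of squared deviations from the mean for data[i:j], with j > i.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[m][j]: minimal cost of splitting data[:j] into m groups.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    split = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = best[m - 1][i] + cost(i, j)
                if c < best[m][j]:
                    best[m][j] = c
                    split[m][j] = i
    # Walk the split table backwards to recover the groups.
    groups, j = [], n
    for m in range(k, 0, -1):
        i = split[m][j]
        groups.append(data[i:j])
        j = i
    return groups[::-1]


groups = optimal_1d_clusters([1.0, 1.2, 0.9, 5.0, 5.3, 4.8, 9.9, 10.1, 10.4], k=3)
# Boundary values: midpoints between the last value of one group and the first of the next.
cuts = [(a[-1] + b[0]) / 2 for a, b in zip(groups, groups[1:])]
```

Because the result is exact, it does not depend on initialization, which is the main weakness of plain K-Means.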
