
I have a one-dimensional list like this:

public class Zeit_und_Eigenschaft
{
    [Feature]
    public double Sekunden { get; set; }
}

//...
List<Zeit_und_Eigenschaft> lzue = new List<Zeit_und_Eigenschaft>();
//fill lzue

For example, lzue could contain:

lzue.Sekunden
1
2
3
4
8
9
10
22
55
...

The goal is to find clusters in that list, i.e. elements that could form groups. For this example, the groups might be:

lzue.Sekunden
1
2
3
4

8
9
10

22

55

Which clustering algorithm is suitable, given that I don't know the number of clusters k? GMM? PCA? K-means? Something else?

Has QUIT--Anony-Mousse
Gewinn
  • possible duplicate of [partitioning an float array into similar segments (clustering)](http://stackoverflow.com/questions/17479944/partitioning-an-float-array-into-similar-segments-clustering) – Has QUIT--Anony-Mousse Nov 27 '13 at 12:04

2 Answers


Don't look for clustering algorithms.

Clustering is a good term for multivariate data, but your data is one-dimensional and can simply be sorted, so you should look at the much older statistics literature instead, for example Jenks natural breaks optimization.

Or just use kernel density estimation and split the data where the estimated density has local minima. In fact, you will find the very same question dozens of times here on Stack Overflow already:

1D Number Array Clustering

Cluster one-dimensional data optimally?

partitioning an float array into similar segments (clustering)

Efficiently grouping similar numbers together

Clustering values by their proximity in python (machine learning?)
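As a rough illustration of the kernel density estimation idea, here is a minimal C# sketch. It is not taken from any of the linked questions: the class and method names (KdeSplit, Density, FindCuts, Group) are hypothetical, and the Gaussian kernel, the fixed bandwidth, and the simple grid search for local minima are all assumptions of this sketch. The bandwidth in particular has to be tuned to the data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class KdeSplit
{
    // Unnormalized Gaussian kernel density at x.
    // bandwidth is an assumed tuning parameter (same unit as the data).
    public static double Density(IReadOnlyList<double> data, double x, double bandwidth)
    {
        double sum = 0.0;
        foreach (double v in data)
        {
            double z = (x - v) / bandwidth;
            sum += Math.Exp(-0.5 * z * z);
        }
        return sum;
    }

    // Evaluates the density on a regular grid between min and max and
    // returns the grid positions that are strict local minima.
    public static List<double> FindCuts(IReadOnlyList<double> data, double bandwidth, double step)
    {
        var cuts = new List<double>();
        double min = data.Min(), max = data.Max();
        int n = (int)Math.Floor((max - min) / step) + 1;
        var d = new double[n];
        for (int i = 0; i < n; i++)
            d[i] = Density(data, min + i * step, bandwidth);
        for (int i = 1; i < n - 1; i++)
            if (d[i] < d[i - 1] && d[i] < d[i + 1])
                cuts.Add(min + i * step);
        return cuts;
    }

    // Sorts the values and starts a new group each time a cut position is passed.
    public static List<List<double>> Group(IEnumerable<double> values, double bandwidth, double step)
    {
        var sorted = values.OrderBy(v => v).ToList();
        var groups = new List<List<double>>();
        if (sorted.Count == 0) return groups;

        var cuts = FindCuts(sorted, bandwidth, step);
        int c = 0;
        var current = new List<double>();
        foreach (double v in sorted)
        {
            while (c < cuts.Count && v > cuts[c])
            {
                if (current.Count > 0)
                {
                    groups.Add(current);
                    current = new List<double>();
                }
                c++;
            }
            current.Add(v);
        }
        groups.Add(current);
        return groups;
    }
}
```

Calling KdeSplit.Group(lzue.Select(z => z.Sekunden), 1.0, 0.5) on the sample values from the question should give the four groups 1-4, 8-10, 22 and 55. A larger bandwidth merges nearby groups, a smaller one splits them further, so the number of groups follows from the bandwidth rather than from a preset k.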

Has QUIT--Anony-Mousse

There was a good article in MSDN Magazine on this topic a few months ago. It uses the k-means algorithm. Link:

http://msdn.microsoft.com/en-us/magazine/jj891054.aspx

Also, there are some videos on k-means clustering as part of Andrew Ng's online machine learning class. Link:

https://class.coursera.org/ml-003/lecture/preview

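When you don't know k, there are heuristics for finding a good value. One of them is the elbow method: run k-means for increasing k and look for the point where the within-cluster error stops dropping sharply. Do a web search for "k-means elbow".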
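To make the elbow idea concrete, here is a minimal C# sketch of one-dimensional k-means that reports the within-cluster sum of squares (WCSS) for each k. It is not the code from the MSDN article: the class and method names (KMeans1D, Wcss, ElbowCurve) are hypothetical, and the deterministic initialization from evenly spaced positions in the sorted data is an assumption made to keep the sketch reproducible. Lloyd's algorithm can end in a local optimum, so the values are a guide rather than guaranteed minima.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class KMeans1D
{
    // Runs Lloyd's algorithm on 1-D data and returns the
    // within-cluster sum of squares for the resulting clustering.
    public static double Wcss(IEnumerable<double> values, int k, int maxIterations = 100)
    {
        var data = values.OrderBy(v => v).ToArray();
        if (k < 1 || k > data.Length) throw new ArgumentOutOfRangeException(nameof(k));

        // Assumed initialization: evenly spaced positions in the sorted data.
        var centroids = new double[k];
        for (int j = 0; j < k; j++)
            centroids[j] = data[(int)((j + 0.5) * data.Length / k)];

        var assignment = new int[data.Length];
        for (int iter = 0; iter < maxIterations; iter++)
        {
            // Assignment step: each value goes to its nearest centroid.
            bool changed = false;
            for (int i = 0; i < data.Length; i++)
            {
                int best = 0;
                for (int j = 1; j < k; j++)
                    if (Math.Abs(data[i] - centroids[j]) < Math.Abs(data[i] - centroids[best]))
                        best = j;
                if (iter == 0 || assignment[i] != best) changed = true;
                assignment[i] = best;
            }
            if (!changed) break;

            // Update step: each centroid becomes the mean of its values.
            for (int j = 0; j < k; j++)
            {
                double sum = 0.0;
                int count = 0;
                for (int i = 0; i < data.Length; i++)
                    if (assignment[i] == j) { sum += data[i]; count++; }
                if (count > 0) centroids[j] = sum / count;
            }
        }

        double wcss = 0.0;
        for (int i = 0; i < data.Length; i++)
        {
            double d = data[i] - centroids[assignment[i]];
            wcss += d * d;
        }
        return wcss;
    }

    // Returns the WCSS for k = 1..maxK; index 0 holds the value for k = 1.
    public static double[] ElbowCurve(IEnumerable<double> values, int maxK)
    {
        var data = values.ToArray();
        return Enumerable.Range(1, maxK).Select(k => Wcss(data, k)).ToArray();
    }
}
```

Printing KMeans1D.ElbowCurve(lzue.Select(z => z.Sekunden), 6) and looking for the k after which the values barely decrease is the manual version of the elbow method. Picking the elbow automatically needs an extra rule, which this sketch leaves out.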

Brannon