Semi-supervised clustering/classification

Question

I have sensor data and want to run a clustering algorithm on it. The data contains no cluster labels, but I can add some labels manually.

How can I use manually added labels to help unsupervised learning?

One simple example: use the labeled measurements as initial centers for k-means. Which density-based algorithm can I use for this data?

What's the size of your data? How many points are you prepared to label manually? — user2974951, Dec 07 '18 at 13:21
The size can be 100k-1m rows. About 7 labels and 10 examples for each — cuga, Dec 07 '18 at 13:52
https://stackoverflow.com/questions/21258367/what-are-some-packages-that-implement-semi-supervised-constrained-clustering — hellpanderr, Dec 07 '18 at 16:18
Semi-supervised learning is a good option. The idea is to label some data points manually and then use a classification algorithm such as kNN to obtain more labels; with kNN, for example, you could label the cases that lie close to your manually labeled points. This should give you enough labels to run the cluster analysis and label all the remaining cases. — user2974951, Dec 07 '18 at 18:39

score 0 · Answer 1 · answered Dec 07 '18 at 14:31

You can choose which samples serve as the initial centers for k-means through the init argument of scikit-learn's KMeans (see its documentation).

If an ndarray is passed to init, it should be of shape (n_clusters, n_features) and gives the initial centers. In this case a single initialization is performed, using the centroids specified in the array.

This required shape means that init must have exactly n_clusters rows, and the number of elements in each row must match the dimensionality of the data points being clustered.
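As a minimal sketch of the above, the snippet below builds one initial center per manual label by averaging that label's labeled rows, then passes the resulting array as init. The synthetic data, the y_manual array (with -1 marking unlabeled rows), and the choice of the per-label mean are all assumptions made for illustration; they are not part of the question's actual data.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for the sensor data: 3 well-separated blobs in 2-D.
true_centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=1.0, size=(200, 2)) for c in true_centers])

# Hypothetical manual labels: 10 labeled rows per cluster, -1 means unlabeled.
y_manual = np.full(len(X), -1)
for k in range(3):
    y_manual[k * 200 : k * 200 + 10] = k

# One initial center per label: the mean of that label's labeled rows.
labels = np.unique(y_manual[y_manual >= 0])
init_centers = np.vstack([X[y_manual == k].mean(axis=0) for k in labels])
# init_centers has shape (n_clusters, n_features), as init requires.

# n_init=1 because an explicit init array allows only a single initialization.
km = KMeans(n_clusters=len(labels), init=init_centers, n_init=1)
cluster_ids = km.fit_predict(X)
```

Because row i of init_centers seeds cluster i, the cluster ids line up with the manual label order, which makes it easy to map clusters back to the labels afterwards. Note that the labels only influence the starting point; k-means is still free to move the centers away from them.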

CarlosHPF

K-means is not a density based algorithm – cuga Dec 07 '18 at 14:34
You can use SSDBSCAN then. http://www.producao.usp.br/handle/BDPI/45673 – CarlosHPF Dec 07 '18 at 16:43
