I'm looking for a method to perform density-based clustering. Unlike with DBSCAN, each resulting cluster should have a representative. Mean-Shift seems to fit, but it doesn't scale well enough for my needs. I have looked into some subspace clustering algorithms and only found CLIQUE using representatives, but this part is not implemented in ELKI.

Milan

2 Answers

As I noted in the comments on the previous iteration of your question (https://stackoverflow.com/questions/34720959/dbscan-java-library-with-corepoints):

Density-based clustering does not assume there is a center or representative.

Consider the following example image from Wikipedia user Chire (CC BY-SA 3.0):

[image: example clustering result from Wikipedia]

Which object should be the representative of the red cluster?

Density-based clustering is about finding "arbitrarily shaped" clusters, and such clusters do not have a meaningful single representative object. The method is not meant to "compress" your data: it is structure discovery, not vector quantization. It is the nature of such complex structure that it cannot be reduced to a single representative. The proper representation of such a cluster is the set of all points in the cluster. For geometric understanding in 2D, you can also compute convex hulls, for example, to get an area as in that picture.
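To illustrate the convex hull idea, here is a minimal sketch in plain Python. It assumes 2D points given as tuples and cluster labels where -1 marks noise (a common convention, but an assumption here); the function names `convex_hull` and `cluster_hulls` are made up for this example and are not part of ELKI or any other library. The hull is computed with Andrew's monotone chain algorithm.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the convex hull of 2D points in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a clockwise (or straight) turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def cluster_hulls(points, labels, noise_label=-1):
    """Map each cluster label to the convex hull of its members."""
    clusters = {}
    for p, lab in zip(points, labels):
        if lab != noise_label:  # assumption: noise carries this label
            clusters.setdefault(lab, []).append(p)
    return {lab: convex_hull(members) for lab, members in clusters.items()}
```

Keep in mind that for a non-convex cluster (a ring or a banana shape, say) the convex hull also covers area that does not belong to the cluster, so it is only a rough visual aid.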

Choosing representative objects is a different task. It is not needed for discovering this kind of structure, and thus these algorithms do not compute representative objects: doing so would waste CPU time.

Has QUIT--Anony-Mousse

You could choose the object with the highest density as representative of the cluster.

It is a fairly easy modification to DBSCAN to store the neighbor count of every object.
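As a rough sketch of this idea, the following plain-Python code recomputes the neighbor counts after clustering instead of modifying DBSCAN itself. It assumes Euclidean distance, the same `eps` radius that was used for clustering, and -1 as the noise label; the function names are hypothetical. The brute-force counting is quadratic in the number of points, so a real implementation would reuse the counts DBSCAN already computes or use a spatial index.

```python
import math


def neighbor_counts(points, eps):
    """Count, for every point, the points within distance eps (itself included)."""
    return [sum(1 for q in points if math.dist(p, q) <= eps) for p in points]


def densest_representatives(points, labels, eps, noise_label=-1):
    """Map each cluster label to the index of its highest-density member.

    Ties are broken in favor of the point that appears first.
    """
    counts = neighbor_counts(points, eps)
    best = {}
    for i, lab in enumerate(labels):
        if lab == noise_label:  # assumption: noise carries this label
            continue
        if lab not in best or counts[i] > counts[best[lab]]:
            best[lab] = i
    return best


# Usage: two small clusters and one noise point.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (10, 1), (50, 50)]
labels = [0, 0, 0, 1, 1, -1]
representatives = densest_representatives(points, labels, eps=1.0)
```

In this toy data set the middle point of the first cluster has the most neighbors, so it becomes that cluster's representative.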

But as Anony-Mousse mentioned, this object may nevertheless be a rather bad choice. Density-based clustering is not designed to yield representative objects.

You could try AffinityPropagation, but it does not scale very well either.

Erich Schubert