For a collection of sets, I'll need to group them into multiple "cluster" that all the node in the cluster is a subset of the largest node in that cluster. E.g.
input = [{1, 2}, {1, 2, 3}, {1, 2, 4}, {1, 4}, {1}]
will get:
[{1, 2, 3}, {1, 2}, {1}],
[{1, 2, 4}, {1, 2}, {1, 4}, {1}]
I've tried building a subset tree with reference of this, but it soon become very slow when input is large since it's iterating all children for every insertion.
I'm not familiar with k-mean clustering, but does it apply to the problem here?
What is the most efficient way of doing it?