I want to use Python to calculate the Jaccard similarity between different clusters. The clusters in my dataset are already labeled. The result must be an adjacency matrix with one row and one column per cluster. I have seen the question about Jaccard similarity between different rows, but I still don't know which value I should calculate for each cluster or how to go about it.

Here is my dataset. Only the first 5 rows are shown; in fact there are more than 3000 rows in 12 clusters.

mkpppp
  • Do you have some labelled examples? i.e. Which cluster is correct for which input? – Nathan McCoy Jan 23 '20 at 23:04
  • Please provide your sample data as text in the body of your question, not as a picture – G. Anderson Jan 23 '20 at 23:24
  • One option is to [find the cluster centroids](https://stackoverflow.com/questions/50332786/how-to-find-cluster-centroid-with-scikit-learn) and their values, then calculate the distance between centroids (since the centroid is meant to be the mathematical center of the cluster) – G. Anderson Jan 23 '20 at 23:36
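
The centroid suggestion in the last comment could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the rows are taken to be numeric feature tuples, the function name `centroid_distance_matrix` and the toy data are made up for the example, and Euclidean distance between centroids is a different measure from Jaccard similarity, so it may or may not suit the original goal.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, available from Python 3.8


def centroid_distance_matrix(rows, labels):
    """Return (sorted cluster labels, matrix of distances between centroids).

    Hypothetical helper: assumes every row is a sequence of numbers and
    that labels[i] is the cluster label of rows[i].
    """
    # Group the rows by their cluster label.
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)

    names = sorted(groups)

    # Centroid = per-feature mean of the rows in the cluster.
    centroids = [
        [sum(column) / len(groups[name]) for column in zip(*groups[name])]
        for name in names
    ]

    # Symmetric matrix; entry [i][j] is the distance between
    # the centroids of clusters names[i] and names[j].
    matrix = [[dist(a, b) for b in centroids] for a in centroids]
    return names, matrix


# Made-up toy data, not the dataset from the question.
rows = [(0.0, 0.0), (2.0, 0.0), (4.0, 3.0), (6.0, 3.0)]
labels = ["a", "a", "b", "b"]
names, matrix = centroid_distance_matrix(rows, labels)
```

The returned matrix has one row and one column per cluster, in the order given by `names`, so it has the adjacency-matrix shape the question asks for, with distances in place of similarities.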

0 Answers