
How can I calculate Jaccard similarity for more than 2 individuals? For example, we can calculate the Jaccard similarity for A1 and A2 as shown below, but if we have, say, 1000 individuals, how do we loop over them?

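def jaccard_similarity(A1, A2):
    # Convert to sets so duplicates (e.g. the second 'cat') are ignored
    s1 = set(A1)
    s2 = set(A2)
    union = s1.union(s2)
    # Guard against division by zero; treating two empty inputs as
    # identical (1.0) is a convention assumed here
    if not union:
        return 1.0
    return len(s1.intersection(s2)) / len(union)

A1 = ['dog', 'cat', 'cat', 'rat']
A2 = ['dog', 'cat', 'mouse']
print(jaccard_similarity(A1, A2))  # 0.5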

Thanks!

  • Jaccard similarity is defined pairwise, so you can compute it for each pair: `(A1, A2)`, `(A1, A3)`, `(A2, A3)`, and so on. You could also divide the size of the intersection of all the sets by the size of the union of all the sets, but I have no idea whether the result is meaningful. [This answer](https://stackoverflow.com/a/58696146/2237151) shows a way to compute the pairwise Jaccard similarity. – Pietro Sep 13 '21 at 13:44
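
The two options from the comment above could be sketched roughly as follows. This is only a minimal illustration, not the linked answer's method: `A3`, the `individuals` dict, and the helper names `pairwise_jaccard` and `multiway_jaccard` are hypothetical, and returning 1.0 for an empty union is an assumed convention.

```python
from itertools import combinations


def jaccard_similarity(a, b):
    """Jaccard similarity of two iterables, compared as sets."""
    s1, s2 = set(a), set(b)
    union = s1 | s2
    # Assumed convention: two empty sets count as identical.
    return len(s1 & s2) / len(union) if union else 1.0


def pairwise_jaccard(individuals):
    """Map each unordered pair of names to its Jaccard similarity."""
    # Build each set once instead of once per pair.
    sets = {name: set(items) for name, items in individuals.items()}
    return {
        (n1, n2): jaccard_similarity(sets[n1], sets[n2])
        for n1, n2 in combinations(sets, 2)
    }


def multiway_jaccard(individuals):
    """Size of the intersection of all sets over size of their union."""
    sets = [set(items) for items in individuals.values()]
    if not sets:
        raise ValueError("need at least one individual")
    union = set.union(*sets)
    return len(set.intersection(*sets)) / len(union) if union else 1.0


# A1 and A2 come from the question; A3 is made up for illustration.
individuals = {
    'A1': ['dog', 'cat', 'cat', 'rat'],
    'A2': ['dog', 'cat', 'mouse'],
    'A3': ['dog', 'rat', 'mouse'],
}

pairs = pairwise_jaccard(individuals)
for (n1, n2), score in pairs.items():
    print(n1, n2, score)

print(multiway_jaccard(individuals))
```

With n individuals there are n(n-1)/2 pairs, so 1000 individuals give 499,500 comparisons, which a plain loop like this should handle, though a vectorized approach may be faster for much larger inputs.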
