6

What is the most efficient way to determine whether a collection of sets is pairwise disjoint? -- i.e. verifying that the intersection between all pairs of sets is empty. How efficiently can this be done?

Robert T. McGibbon
  • 5,075
  • 3
  • 37
  • 45
  • 1
    I am wary about your question. How efficiently it can be done depends on many factors, like the size of the sets, the number of the sets, the overlap of the sets with each other and many more. – Niklas B. Mar 16 '14 at 05:07
  • 1
    You can also probably maintain your sets differently so that this operations becomes much more efficient than the O(n) lower bound that you otherwise have because all the elements need to be looked at. – Niklas B. Mar 16 '14 at 05:10
  • While we're at it: How *are* you representing your sets? And what are the operations you perform on them? – Niklas B. Mar 16 '14 at 05:13
  • For (only) 2 lists there's [Test if lists share any items in python - Stack Overflow](https://stackoverflow.com/questions/3170055/test-if-lists-share-any-items-in-python) – user202729 Dec 05 '21 at 17:46

3 Answers3

7

The sets from a collection are pairwise disjoint if, and only if, the size of their union equals the sum of their sizes (this statement applies to finite sets):

def pairwise_disjoint(sets) -> bool:
    union = set().union(*sets)
    return len(union) == sum(map(len, sets))

This could be a one-liner, but readability counts.

0 _
  • 10,524
  • 11
  • 77
  • 109
6

Expected linear time O(total number of elements):

def all_disjoint(sets):
    union = set()
    for s in sets:
        for x in s:
            if x in union:
                return False
            union.add(x)
    return True

This is optimal under the assumption that your input is a collection of sets represented as some kind of unordered data structure (hash table?), because than you need to look at every element at least once.

You can do much better by using a different representation for your sets. For example, by maintaining a global hash table that stores for each element the number of sets it is stored in, you can do all the set operations optimally and also check for disjointness in O(1).

0 _
  • 10,524
  • 11
  • 77
  • 109
Niklas B.
  • 92,950
  • 18
  • 194
  • 224
  • 2
    this is overloading the built-in function `all`, right? might as well use a different variable name – tscizzle Oct 07 '15 at 19:39
1

Using Python as psudo-code. The following tests for the intersection of each pair of sets only once.

def all_disjoint(sets):
    S = list(sets)
    while S:
        s = S.pop()  # remove an element
        # loop over the remaining ones
        for t in S:
            # test for intersection
            if not s.isdisjoint(t):
               return False
    return True

The number of intersection tests is the same as the number of edges in a fully connected graph with the same number of vertexes as there are sets. It also exits early if any pair is found not to be disjoint.

0 _
  • 10,524
  • 11
  • 77
  • 109
Dan D.
  • 73,243
  • 15
  • 104
  • 123