3

Is there an easier/faster way to find out if two dictionaries are disjoint than calculating their intersection?

For the intersection I found this answer, so the disjoint-test would look like this:

def dicts_disjoint(a, b):
    keys_a = set(a.keys())
    keys_b = set(b.keys())
    intersection = keys_a & keys_b
    return not len(intersection)

However I think this is inefficient since it always calculates the whole intersection (no short-circuit quit).

Any better ideas?

Community
  • 1
  • 1
Johannes
  • 3,300
  • 2
  • 20
  • 35

4 Answers4

9

Don't convert to a set; a dict_keys object already supports isdisjoint.

d1.keys().isdisjoint(d2)
Veedrac
  • 58,273
  • 15
  • 112
  • 169
4

Are you looking for something like:

def dicts_disjoint(a, b):
    return not any(k in b for k in a)

Or:

def dicts_disjoint(a, b):
    return all(k not in b for k in a)

Both will short-circuit.

AChampion
  • 29,683
  • 4
  • 59
  • 75
  • it might be easier to read as `all(k not in b for k in a)`. I always have to read the `not any`s twice xD – Ma0 Jul 05 '17 at 14:50
  • Are you sure this is **faster** (as OP wants) than set intersection? – randomir Jul 05 '17 at 14:52
  • @randomir Due to short-circuiting and the arbitrary nature of dicts it is very hard to say. This is a problem of statistics. For what it's worth, I believe it is faster. – Ma0 Jul 05 '17 at 14:56
  • This will depend on the size of the dict, it will be faster for smaller dicts. Both are `O(n)` but the set intersection will have the advantage of `C` code for larger dicts to overcome the `set()` construction, providing most are disjoint. – AChampion Jul 05 '17 at 14:57
  • this is neat, I like it – Johannes Jul 05 '17 at 14:59
  • And I suppose if you have a very large dict and a very small one it is a lot faster if `a` is the smaller one. Correct? (Assuming dict lookup is constant time) – Johannes Jul 05 '17 at 15:03
  • Yes, `dict` lookup is `O(1)`. – AChampion Jul 05 '17 at 15:04
4

Edited to display methods and timings only

Since the OP asked about the fastest method to perform this operation, I've ranked the ones under discussion according to a (I hope) fair test on my machine. The aim is to find if the keys of a dictionary are disjoint, and it seems the dict_keys.isdisjoint() method wins out over other set or list operations.

However, as mentioned in other answers, this will vary considerably depending on the size of the relative dictionaries and whether or not they are disjoint.

These tests are only for two disjoint dictionaries of equal (small) size.

Fastest: dict_keys.isdisjoint()

Example:

{"a": 1, "b": 2, "c": 3 }.keys().isdisjoint({ "d": 4, "e": 5, "f": 6}.keys())

Timing:

>>> timeit.timeit('{"a": 1, "b": 2, "c": 3 }.keys().isdisjoint({ "d": 4, "e": 5, "f": 6}.keys())')
0.4637166199972853

Second Fastest: set.isdisjoint()

Example:

set({"a": 1, "b": 2, "c": 3 }.keys()).isdisjoint(set({ "d": 4, "e": 5, "f": 6}.keys()))

Timing:

>>> timeit.timeit('set({"a": 1, "b": 2, "c": 3 }.keys()).isdisjoint(set({ "d": 4, "e": 5, "f": 6}.keys()))')
0.774243315012427

Third Fastest: List Comp and all():

Example:

all(k not in {"a": 1, "b": 2, "c": 3 } for k in { "d": 4, "e": 5, "f": 6})

Timing:

>>> timeit.timeit('all(k not in {"a": 1, "b": 2, "c": 3 } for k in { "d": 4, "e": 5, "f": 6})')
0.8577601349970791

Fourth Fastest: Symmetric Difference (^) with not()

Example:

not set({"a": 1, "b": 2, "c": 3 }.keys()) ^ set({ "d": 4, "e": 5, "f": 6}.keys())

Timing:

>>> timeit.timeit('not set({"a": 1, "b": 2, "c": 3 }.keys()) ^ set({ "d": 4, "e": 5, "f": 6}.keys())')
0.9617313010094222
Dom Weldon
  • 1,728
  • 1
  • 12
  • 24
  • This is - of course, if you're only looking to see if no keys are shared. – Dom Weldon Jul 05 '17 at 14:50
  • 1
    `not bool(x)` is just `not x`. Converting with `set` is a waste of time and won't short-circuit. `^` won't short-circuit. – Veedrac Jul 05 '17 at 15:04
  • Good point @Veedrac - although I sometimes write `bool()` to aid readability, removing it sped up the script considerably. Not sure what you mean about short-circuiting `^` though? This operation only works on sets to my knowledge... – Dom Weldon Jul 05 '17 at 15:08
  • 1
    Your third test is not a valid comparison. You would use dicts instead of lists, and `k not in list` runs in linear time while `k not in dict` takes constant. So its highly dependent on the input size. – Johannes Jul 05 '17 at 15:10
  • Thanks - I've edited it to reflect a fair timing for all methods. – Dom Weldon Jul 05 '17 at 15:20
1

Using only bit operation: variables

dict1 = {"a":1, "b":2}
dict2 = {"a":4, "c":2}

union

dict1.keys() | dict2.keys()
# {'b', 'a', 'c'}

intersection

dict1.keys() & dict2.keys()
# {'a'}

disjoint

dict1.keys() ^ dict2.keys()
# {'b', 'c'}

Then set a condition:

"different" if dict1.keys() ^ dict2.keys() else "equal"

An empty set() return "False" in boolean conditions

Franz Kurt
  • 1,020
  • 2
  • 14
  • 14