6

Which is more performant, and what is asymptotic complexity (or are they equivalent) in Python?

set.add(12)

if 12 not in somelist:
    somelist.append(12)
GL2014
  • 6,016
  • 4
  • 15
  • 22
  • 4
    https://wiki.python.org/moin/TimeComplexity. `item in list` is `O(n)`, so `set.add` wins. – jonrsharpe Jan 04 '17 at 14:21
  • That was simpler than I expected to answer, care to provide some implementation details in an answer, and I'll accept it? It says set is similar to dict, are they both hash maps in Cpython? – GL2014 Jan 04 '17 at 14:26
  • sets: http://stackoverflow.com/questions/3949310/how-is-set-implemented lists: http://stackoverflow.com/questions/3917574/how-is-pythons-list-implemented – Patrick Haugh Jan 04 '17 at 14:30

1 Answers1

14

The set is far faster, in general. Testing for membership in a list is O(n), linear in the size of the list. Adding to a set is O(1), independent of the number of the items in the list. Aside from that, the list code makes two function calls: one to check if 12 is in the list, another to add it, while the set operation makes just one call.

Note that the list solution can be fast, though, when the item doesn't need to be added to the list because it was found early in the list.

# Add item to set
$ python -m timeit -s 's = set(range(100))' 's.add(101)'
10000000 loops, best of 3: 0.0619 usec per loop

# Add item not found in list
$ python -m timeit -s 'l = list(range(100))' 'if 101 not in l: l.append(101)'
1000000 loops, best of 3: 1.23 usec per loop

# "Add" item found early in list
$ python -m timeit -s 'l = list(range(100))' 'if 0 not in l: l.append(0)'
10000000 loops, best of 3: 0.0214 usec per loop

# "Add" item found at the end of the list
$ python -m timeit -s 'l = list(range(102))' 'if 101 not in l: l.append(101)'
1000000 loops, best of 3: 1.24 usec per loop
chepner
  • 497,756
  • 71
  • 530
  • 681
  • Can you please elaborate on the "found early in list" scenario. Since a list can contain duplicate elements, if an item is found early in the list why does it not have to be added? – Saurabh Oct 14 '20 at 19:16
  • 1
    If the item is found *anywhere* in the list , `l.append` won't be called. The question is, how long does it take to find the item. With a `set`, it takes the same amount of time to find an item, no matter where in the set it is found. With a `list`, the time it takes is proportional to its index: the earlier in the list an item occurs, the fast it can be found. – chepner Oct 14 '20 at 20:40