The official Python 2.7 docs for these methods sound nearly identical; the sole difference seems to be that remove() raises a KeyError while discard() does not.

I'm wondering if there is a difference in execution speed between these two methods. Failing that, is there any meaningful difference (barring KeyError) between them?

cottontail
Akshat Mahajan
  • Related post on similar lines for list data structure - [Difference between del, remove and pop on lists](https://stackoverflow.com/q/11520492/465053) – RBT Aug 01 '18 at 02:52

2 Answers

Raising an exception in one case is a pretty meaningful difference. If attempting to remove an element that is not in the set should be treated as an error, you had better use set.remove() rather than set.discard().
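To make the behavioural difference concrete, here is a minimal sketch using only the built-in set type. The variable names `s` and `missing` are just for illustration.

```python
s = {1, 2, 3}

# discard() is a silent no-op when the element is absent.
s.discard(4)

# remove() raises KeyError for the same call.
try:
    s.remove(4)
except KeyError as exc:
    missing = exc.args[0]  # the key that was not found

# Both behave the same when the element is present.
s.discard(1)
s.remove(2)

print(s, missing)  # prints: {3} 4
```

If a missing element indicates a bug in your program, remove() surfaces it immediately; if absence is expected and harmless, discard() saves you the try/except.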

The two methods are identical in implementation, except that compared to set_discard() the set_remove() function adds the lines:

if (rv == DISCARD_NOTFOUND) {
    set_key_error(key);
    return NULL;
}

This raises the KeyError. As this is slightly more work, set.remove() is the teeniest fraction slower; your CPU has to do one extra test before returning. But if your algorithm depends on the exception, then the extra branching test is hardly going to matter.
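As a rough illustration of that relationship, the same logic can be modelled in pure Python. This is only a sketch, not the CPython implementation: the `ModelSet` class and its `_discard_key` helper are hypothetical names, and the numeric values given to the status constants are assumed for the example.

```python
# Status codes, named after the constant in the C snippet above.
# The numeric values here are assumptions for this sketch.
DISCARD_NOTFOUND = 0
DISCARD_FOUND = 1


class ModelSet:
    """Hypothetical toy set showing remove() as discard() plus one check."""

    def __init__(self, items=()):
        # A dict's keys stand in for the real hash table.
        self._items = dict.fromkeys(items)

    def __contains__(self, key):
        return key in self._items

    def _discard_key(self, key):
        # Shared helper: delete the key if present and report what happened.
        if key in self._items:
            del self._items[key]
            return DISCARD_FOUND
        return DISCARD_NOTFOUND

    def discard(self, key):
        # Ignores the status code entirely.
        self._discard_key(key)

    def remove(self, key):
        # Same work as discard(), plus one extra comparison.
        rv = self._discard_key(key)
        if rv == DISCARD_NOTFOUND:
            raise KeyError(key)


m = ModelSet([1, 2])
m.discard(3)   # missing key: nothing happens
m.remove(1)    # present key: removed without error
try:
    m.remove(3)  # missing key: KeyError
    raised = False
except KeyError:
    raised = True
```

The only thing remove() does beyond discard() is inspect the status code, which is why any speed difference is so small.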

Martijn Pieters

I was curious about this as well, so I ran a timeit test, and there seems to be no consistent difference (see the results here). Essentially, the best timing of 5 runs of discard was compared with the best timing of 5 runs of remove. This comparison was repeated 3 times: remove was slower than discard once, and discard was slower twice.

The code used for the experiment is as follows.

import timeit
from collections import Counter

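# Each run rebuilds the set and a fresh iterator over the same 1,000,000 keys.
# timeit.repeat() executes the statement 1,000,000 times per run by default,
# so every key is consumed exactly once and remove() never hits a missing key.
setup = 'set_, keys = set(range(1000000)), iter(range(1000000))'

lst = []
for _ in range(3):
    # The default repeat count is 5 on Python 3.7+; min() picks the best run.
    t1 = timeit.repeat('set_.discard(next(keys))', setup)
    t2 = timeit.repeat('set_.remove(next(keys))', setup)
    # True when discard's best run beat remove's best run.
    lst.append(min(t1) < min(t2))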

print("In 3 experiments, remove was slower than discard {} times".format(Counter(lst)[True]))
# In 3 experiments, remove was slower than discard 1 times
cottontail