What distinguishes - and .difference() on sets? Obviously the syntax is not the same. One is a binary operator, and the other is an instance method. What else?

s1 = set([1,2,3])
s2 = set([3,4,5])

>>> s1 - s2
set([1, 2])
>>> s1.difference(s2)
set([1, 2])
Peter Mortensen
David542

3 Answers


The methods set.difference, set.union and so on accept any iterable as an argument, whereas the - operator requires both operands to be sets. When both operands are sets, there is no difference in the output.

Operation         Equivalent   Result
s.difference(t)   s - t        new set with elements in s but not in t

With .difference you can do things like:

s1 = set([1,2,3])

print(s1.difference(*[[3],[4],[5]]))

{1, 2}
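To make the call above a little more concrete, here is a small sketch, assuming Python 3. The names `others`, `unpacked`, `explicit` and `mixed` are made up for illustration. It shows that star-unpacking a list of iterables is the same as passing them one by one, and that the arguments do not have to be lists:

```python
s1 = {1, 2, 3}

# Star-unpacking a list of iterables is the same as passing them one by one.
others = [[3], [4], [5]]
unpacked = s1.difference(*others)
explicit = s1.difference([3], [4], [5])

# The arguments need not be lists: any iterable of hashable items works.
# A dict contributes its keys, a string contributes its characters.
mixed = s1.difference(range(3, 4), (x for x in [4]), {5: "five"}, "abc")

print(unpacked, explicit, mixed)
```

All three results should be equal to `{1, 2}`, and `s1` itself is left unchanged because `difference` returns a new set.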

It is also more efficient when you pass several iterables in a single call using the *(iterable, iterable) syntax, as you don't create intermediary sets. You can see some comparisons here

Nam G VU
Padraic Cunningham

I guess the next question is... why? Why isn't set subtraction defined to be the same as set difference? – user48956 Feb 26 '19 at 18:50

@user48956 `set_a - set_b` is defined by the magic (or dunder) method, `__sub__` and is equivalent to `set_a.__sub__(set_b)`. As such, the difference operator is dependent on the class of the leftmost object. From there, it's all implementation details, however obscure, like those in the answer of @Abhijit – Vessel Dec 20 '22 at 01:53
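
The dispatch described in the comment above can be poked at directly. The following is a minimal sketch, assuming CPython 3. The variable names are invented for illustration. It suggests that `set.__sub__` declines non-set operands by returning `NotImplemented`, after which the interpreter finds no reflected method on the list and raises `TypeError`:

```python
s = {1, 2, 3}

# The operator form dispatches to set.__sub__ on the left operand.
same = (s - {3}) == s.__sub__({3})

# frozenset counts as a set, so the operator accepts it.
with_frozen = s - frozenset([3])

# For a non-set operand, __sub__ returns NotImplemented rather than raising.
declined = s.__sub__([3])

# list defines no reflected __rsub__, so the operator ends in a TypeError.
error = None
try:
    s - [3]
except TypeError as exc:
    error = str(exc)

print(same, with_frozen, declined, error)
```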

It may not be evident at a glance, but buried deep in the documentation is a paragraph dedicated to distinguishing the method call from the operator version:

Note, the non-operator versions of union(), intersection(), difference(), and symmetric_difference(), issubset(), and issuperset() methods will accept any iterable as an argument. In contrast, their operator based counterparts require their arguments to be sets. This precludes error-prone constructions like set('abc') & 'cbs' in favor of the more readable set('abc').intersection('cbs').
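
The quoted note can be checked with a short sketch, assuming Python 3. The flag names `operator_ok` and `compare_ok` are hypothetical and only record whether the operator form was accepted:

```python
letters = set("abc")

# Method form: the string is treated as an iterable of characters.
common = letters.intersection("cbs")

# Operator form with the same string is rejected.
try:
    letters & "cbs"
    operator_ok = True
except TypeError:
    operator_ok = False

# The same split applies to the subset test.
is_sub = {1, 2}.issubset([1, 2, 3])
try:
    {1, 2} <= [1, 2, 3]
    compare_ok = True
except TypeError:
    compare_ok = False

print(common, operator_ok, is_sub, compare_ok)
```

Here `common` holds the shared letters `'b'` and `'c'`, while both operator attempts fall into the `except` branch.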

Peter Mortensen
Abhijit

The documentation indicates that difference can take multiple arguments, so it may be both clearer and more efficient for things like:

s1 = set([1, 2, 3, 4])
s2 = set([2, 5])
s3 = set([3, 6])
s1.difference(s2, s3) # instead of s1 - s2 - s3

but I would suggest some testing to verify.
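
One way to run that test is with `timeit`. This is only a rough sketch: the set sizes, the repeat count and the helper names `chained` and `single_call` are arbitrary choices, and the timings will vary with the Python version and the data, so no particular winner is assumed here.

```python
import timeit

# Arbitrary sample data; real workloads may behave differently.
s1 = set(range(10_000))
s2 = set(range(0, 10_000, 2))
s3 = set(range(0, 10_000, 3))

def chained():
    # Builds an intermediate set for s1 - s2 before subtracting s3.
    return s1 - s2 - s3

def single_call():
    # Passes both sets to one method call.
    return s1.difference(s2, s3)

# Both forms should produce the same result before timing them.
results_match = chained() == single_call()

t_chained = timeit.timeit(chained, number=200)
t_single = timeit.timeit(single_call, number=200)

print(f"results match:         {results_match}")
print(f"s1 - s2 - s3:          {t_chained:.4f} s")
print(f"s1.difference(s2, s3): {t_single:.4f} s")
```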