38

Python sets have these methods:

s.union(t)  s | t   new set with elements from both s and t

s.update(t) s |= t  return set s with elements added from t

Likewise, there are also these:

s.intersection_update(t)    s &= t  return set s keeping only elements also found in t

s.intersection(t)   s & t   new set with elements common to s and t

And so on, for all the standard set algebra operations.

What exactly is the difference here? I see that it says that the update() versions return s instead of a new set, but if I write x = s.update(t), does that mean that id(x) == id(s)? Are they references to the same object now?

Why are both sets of methods implemented? It doesn't seem to add any significant functionality.

mkrieger1
temporary_user_name

3 Answers

56

They are very different. One set of methods changes the set in place, while the other leaves the original set alone and returns a new set instead.

>>> s = {1, 2, 3}
>>> news = s | {4}
>>> s
set([1, 2, 3])
>>> news
set([1, 2, 3, 4])

Note how s has remained unchanged.

>>> s.update({4})
>>> s
set([1, 2, 3, 4])

Now I've changed s itself. Note also that .update() didn't appear to return anything; it did not return s to the caller and the Python interpreter did not echo a value.

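By convention, methods on Python's built-in types that change an object in place do not return that object. Most of them, including all the set _update methods, return None instead (which the interpreter never echoes); a few, such as list.pop(), return some other value.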
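To answer the id(x) == id(s) part of the question directly, here is a small sketch of the two behaviours side by side. The variable names are only illustrative.

```python
s = {1, 2, 3}
original_id = id(s)

# The non-mutating form builds a new set object and leaves s alone.
news = s.union({4})
assert news is not s
assert s == {1, 2, 3}

# The in-place form mutates s and returns None, so x is NOT an alias of s.
x = s.update({4})
assert x is None
assert id(s) == original_id   # still the same object, now containing 4
assert s == {1, 2, 3, 4}
```

So after x = s.update(t), x is None rather than a second reference to s.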

Martijn Pieters
  • 4
    Ha, 14 months later how this works is so blatantly obvious that it's amazing to me that I had to ask. Thanks though....it's a process. – temporary_user_name Mar 29 '14 at 07:29
  • 1
    Note that `s1.update(s2)` is equivalent to `s1|=s2`. I wonder if there's any performance difference between the two. – Yibo Yang Nov 09 '16 at 22:31
  • 5
    @YiboYang: I ran a `timeit` trial; there is not much difference; `s1|=s2` is fractionally slower (about 1-2% so). Take into account that augmented `|=` assignment does really assign back to `s1` even if that object was first updated in-place (which can have consequences if the target name was a class attribute first, or an element of an immutable object). – Martijn Pieters Nov 10 '16 at 11:30
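The consequences mentioned in the last comment can be sketched as follows. This is only an illustration: the tuple holder and the Config class are hypothetical names made up for the example.

```python
s = {1}
holder = (s,)            # an immutable tuple holding a mutable set

try:
    holder[0] |= {2}     # mutates the set, then attempts holder[0] = <result>
except TypeError:
    rebinding_failed = True
else:
    rebinding_failed = False

# The in-place update had already happened before the assignment failed.
assert rebinding_failed
assert s == {1, 2}

# Calling the method involves no assignment, so nothing is raised.
holder[0].update({3})
assert s == {1, 2, 3}


class Config:
    tags = set()         # class attribute shared by all instances


c = Config()
c.tags |= {"x"}          # mutates Config.tags AND binds an instance attribute

assert "tags" in vars(c)           # the assignment created c.tags
assert c.tags is Config.tags       # both names refer to the same set
assert Config.tags == {"x"}
```

With .update() there is no assign-back step, so neither the TypeError nor the extra instance attribute appears.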
2

The _update methods modify the set in-place and return None. The methods without update return a new object. You almost certainly do not want to do x = s.update(t), since that will set x to None.

>>> x = set([1, 2])
>>> x.intersection(set([2, 3]))
set([2])
>>> x
set([1, 2])
>>> x.intersection_update(set([2, 3]))
>>> x
set([2])
>>> x = x.intersection_update(set([2, 3]))
>>> print x
None

The functionality added by the _update methods is the ability to modify existing sets. If you share a set between multiple objects, you may want to modify the existing set so the other objects sharing it will see the changes. If you just create a new set, the other objects won't know about it.
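A minimal sketch of that sharing scenario, assuming a hypothetical Subscriber class that simply stores a reference to the set it is given:

```python
class Subscriber:
    """Hypothetical object that keeps a reference to a shared set."""

    def __init__(self, topics):
        self.topics = topics   # stores a reference, not a copy


shared = {"a", "b"}
first = Subscriber(shared)
second = Subscriber(shared)

# In-place: every holder of the reference sees the change.
shared.intersection_update({"b", "c"})
assert first.topics == {"b"}
assert second.topics == {"b"}

# Rebinding: `shared` now names a new set; the subscribers keep the old one.
shared = shared | {"z"}
assert shared == {"b", "z"}
assert first.topics == {"b"}
assert first.topics is second.topics
```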

BrenBarn
1

It looks like the docs don't state it in the clearest way possible, but set.update doesn't return anything at all (which is equivalent to returning None), and neither does set.intersection_update. Like list.append, list.extend, or dict.update, they modify the container in place.

In [1]: set('abba')
Out[1]: set(['a', 'b'])

In [2]: set('abba').update(set('c'))

In [3]: 

Edit: actually, the docs don't say what you show in the question. They say:

Update the set, adding elements from all others.

and

Update the set, keeping only elements found in it and all others.
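The "all others" wording in those two descriptions can be illustrated with a short sketch. It also shows a related difference: the method forms accept any iterables, while the operator forms require sets on both sides. The values are arbitrary.

```python
s = {1, 2, 3, 4}

# update() accepts several arguments, and they may be any iterables.
s.update([5, 6], (7,), range(8, 10))
assert s == {1, 2, 3, 4, 5, 6, 7, 8, 9}

# intersection_update() keeps only elements found in s and in all others.
s.intersection_update([1, 2, 3, 9], {2, 3, 9, 10})
assert s == {2, 3, 9}

# The operator forms require both operands to be sets.
try:
    s |= [11]
except TypeError:
    operator_rejected_list = True
else:
    operator_rejected_list = False

assert operator_rejected_list
assert s == {2, 3, 9}   # the failed |= left s unchanged
```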

Lev Levitsky