193

What is the "one [...] obvious way" to add all items of an iterable to an existing set?

codeforester
Ian Mackinnon

6 Answers

268

You can add elements of a list to a set like this:

>>> foo = set(range(0, 4))
>>> foo
set([0, 1, 2, 3])
>>> foo.update(range(2, 6))
>>> foo
set([0, 1, 2, 3, 4, 5])
SingleNegationElimination
  • 2
    Just looked back at my interpreter session and I actually tried this, but thought that it had added the whole list as an element of the set because of the square brackets in the representation of the set. I had never noticed before that they're represented like that. – Ian Mackinnon Oct 28 '10 at 17:33
  • 7
    That representation allows you to paste it right back in an interactive session, because the `set` constructor takes an iterable as its argument. – Frank Kusters Apr 05 '13 at 07:02
  • 6
    Note that the representation is just e.g. `{1, 2, 3}` in Python 3 whereas it was `set([1, 2, 3])` in Python 2. – Resigned June 2023 Nov 26 '17 at 00:04
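For readers on Python 3, the session above can be sketched as a short script. It also shows that `update` accepts more than one iterable per call, and that the iterables need not be lists. The variable names and extra values are illustrative only.

```python
# Python 3 sketch of the session above; names and values are illustrative.
foo = set(range(0, 4))
foo.update(range(2, 6))          # adds every item of the iterable; duplicates are ignored
print(foo)                       # printed with braces in Python 3, not set([...])

# update() also accepts several iterables of any kind in a single call.
foo.update([6, 7], (8,), "ab")   # a string contributes its individual characters
print(sorted(foo, key=str))      # sort by string form because ints and strs are mixed
```

Passing a string is rarely what you want, since it is iterated character by character; wrap it in a list if the whole string should be one element.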
52

For the benefit of anyone who might believe e.g. that doing aset.add() in a loop would have performance competitive with doing aset.update(), here's an example of how you can test your beliefs quickly before going public:

>\python27\python -mtimeit -s"it=xrange(10000);a=set(xrange(100))" "a.update(it)"
1000 loops, best of 3: 294 usec per loop

>\python27\python -mtimeit -s"it=xrange(10000);a=set(xrange(100))" "for i in it:a.add(i)"
1000 loops, best of 3: 950 usec per loop

>\python27\python -mtimeit -s"it=xrange(10000);a=set(xrange(100))" "a |= set(it)"
1000 loops, best of 3: 458 usec per loop

>\python27\python -mtimeit -s"it=xrange(20000);a=set(xrange(100))" "a.update(it)"
1000 loops, best of 3: 598 usec per loop

>\python27\python -mtimeit -s"it=xrange(20000);a=set(xrange(100))" "for i in it:a.add(i)"
1000 loops, best of 3: 1.89 msec per loop

>\python27\python -mtimeit -s"it=xrange(20000);a=set(xrange(100))" "a |= set(it)"
1000 loops, best of 3: 891 usec per loop

Looks like the cost per item of the loop approach is over THREE times that of the update approach.

Using |= set() costs about 1.5x what update does but half of what adding each individual item in a loop does.

ArtOfWarfare
John Machin
17

You can use the set() function to convert an iterable into a set, and then use the standard in-place union operator (|=) to add the unique values from your new set to the existing one.

>>> a = { 1, 2, 3 }
>>> b = ( 3, 4, 5 )
>>> a |= set(b)
>>> a
set([1, 2, 3, 4, 5])
gbc
  • 5
    Using `.update` has the benefit that the argument can be any iterable (not necessarily a set), unlike the RHS of the `|=` operator in your example. – tzot Oct 28 '10 at 21:31
  • 1
    Good point. It's just an aesthetic choice since set() can convert an iterable into a set, but the number of keystrokes is the same. – gbc Oct 28 '10 at 23:22
  • I have never seen that operator before, I'll enjoy using it when it pops up in the future; thanks! – eipxen Aug 01 '11 at 20:02
  • 1
    @eipxen: There's `|` for union, `&` for intersection, and `^` for getting elements that are in one or the other but not both. But in a dynamically typed language where it's sometimes difficult to read the code and know the types of objects flying around, I feel hesitant to use these operators. Someone who doesn't recognize them (or perhaps doesn't even realize that Python allows for operators like these) could be confused and think some weird bitwise or logical operations are going on. It'd be nice if these operators worked on other iterables, too... – ArtOfWarfare Mar 30 '15 at 19:00
  • Ran some timing tests on this versus `.update()` and adding individual elements in a loop. Found that `.update()` was faster. I added my results to this existing answer: http://stackoverflow.com/a/4046249/901641 – ArtOfWarfare Mar 30 '15 at 19:39
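The difference raised in the comments can be checked directly. This is a minimal sketch reusing the answer's `a` and `b`; the names `left`, `right`, `union`, `intersection`, and `symmetric_difference` are introduced here purely for illustration.

```python
a = {1, 2, 3}
b = (3, 4, 5)

# The in-place operator only accepts a set (or frozenset) on the right-hand side.
try:
    a |= b
except TypeError as exc:
    print("|= rejected a tuple:", exc)

a |= set(b)      # works once the tuple has been converted to a set
a.update(b)      # update() takes the tuple directly, no conversion needed

# The related operators mentioned in the comments, applied to two sets.
left, right = {1, 2, 3}, {3, 4}
union = left | right                   # items in either set
intersection = left & right            # items in both sets
symmetric_difference = left ^ right    # items in exactly one of the sets
```

The named methods `union`, `intersection`, and `symmetric_difference` accept arbitrary iterables, which may be the more readable choice when the operand types are not obvious from context.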
8

Just a quick update with timings using Python 3:

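#!/usr/bin/env python3
from timeit import Timer

a = set(range(1, 100000))
b = list(range(50000, 150000))

def one_by_one(s, l):
    # Add each item with a separate Python-level method call.
    for i in l:
        s.add(i)

def cast_to_list_and_back(s, l):
    # Builds a new set and rebinds only the local name; the caller's set is unchanged.
    s = set(list(s) + l)

def update_set(s, l):
    # Add all items in a single call.
    s.update(l)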

The results, in seconds, are:

one_by_one 10.184448844986036
cast_to_list_and_back 7.969255169969983
update_set 2.212590195937082
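The snippet above imports `Timer` but does not show how the functions were timed. One way to drive them is sketched below. The repeat count (`REPEATS`), the names `base` and `extra`, and the choice to copy the set before each call are assumptions, not taken from the answer, so absolute numbers will differ from those reported.

```python
from timeit import Timer

base = set(range(1, 100000))
extra = list(range(50000, 150000))

def one_by_one(s, l):
    for i in l:
        s.add(i)

def cast_to_list_and_back(s, l):
    return set(list(s) + l)

def update_set(s, l):
    s.update(l)

REPEATS = 20  # assumed value; the original repeat count is not stated

timings = {}
for func in (one_by_one, cast_to_list_and_back, update_set):
    # Copy the set on every call so each run starts from the same state.
    # The copy cost is included in all three timings equally.
    timer = Timer(lambda: func(base.copy(), extra))
    timings[func.__name__] = timer.timeit(number=REPEATS)
    print(func.__name__, timings[func.__name__])
```

Without the copy, the first call would add all the new items and later calls would only re-add existing ones, which would skew the comparison.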
Daniel Dror
0

Use list comprehension.

As a shortcut, this example creates the iterable from a list :)

>>> x = [1, 2, 3, 4]
>>> 
>>> k = iter(x)
>>> k
<listiterator object at 0x100517490>
>>> l = [y for y in k]
>>> l
[1, 2, 3, 4]
>>> 
>>> z = set([1, 2])
>>> z.update(l)
>>> z
set([1, 2, 3, 4])
>>> 

[Edit: missed the set part of question]

pyfunc
-2
for item in items:
   extant_set.add(item)

For the record, I think the assertion that "There should be one-- and preferably only one --obvious way to do it." is bogus. It makes an assumption that many technically minded people make: that everyone thinks alike. What is obvious to one person is not so obvious to another.

I would argue that my proposed solution is clearly readable, and does what you ask. I don't believe there are any performance hits involved with it--though I admit I might be missing something. But despite all of that, it might not be obvious and preferable to another developer.

John Machin
jaydel
  • Argh! The for loop being on one line like that is formatting in my answer--I would never do that. Ever. – jaydel Oct 28 '10 at 17:28
  • You are absolutely correct. I edited the post to repair my damage. Thanks :) – jaydel Oct 28 '10 at 17:32
  • 10
    You are missing the point that `aset.update(iterable)` loops at C speed whereas `for item in iterable: aset.add(item)` loops at Python speed, with a method lookup and a method call (aarrgghh!!) per item. – John Machin Oct 28 '10 at 18:33
  • 2
    Sorry, he said nothing about performance in his question so I didn't worry about it. – jaydel Jul 14 '11 at 15:59