I am having trouble with a list of sets, and I think it is because I initialized it wrong. Is this a valid way to initialize and add to a list of 5000 sets?

sets = [set()]*5000
i = 0
for each in f:
    line = each.split()
    if (some statement):
        i = line[1]

    else:
        sets[i].add(line[0])

Any advice would be much appreciated.

dawg
Alex Brashear

2 Answers

[set()]*5000 evaluates set() once and stores a reference to that same single set in every list index. So adding to the set through one index changes what you see at all the other indices too.
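The aliasing can be checked directly by counting distinct objects. This is a minimal sketch; the names shared and separate are illustrative and do not come from the question:

```python
# Illustrative names; not from the original question.
shared = [set()] * 3                    # one set object, referenced three times
separate = [set() for _ in range(3)]    # three distinct set objects

shared[0].add('a')
separate[0].add('a')

# Count how many different objects each list actually holds.
shared_ids = len({id(s) for s in shared})
separate_ids = len({id(s) for s in separate})

print(shared_ids, separate_ids)   # 1 3
print(shared[1] is shared[0])     # True: both slots name the same set
print(separate[1] is separate[0]) # False
```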

To create a list of distinct sets, you can use a list comprehension, which evaluates set() once per element:

sets = [set() for _ in range(5000)]  # use xrange on Python 2 to avoid building a temporary list
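Applied to the loop in the question, the fix might look like the sketch below. Several pieces are assumptions, because the question does not show them: the in-memory file f and its format are made up, is_header is a hypothetical placeholder for the unspecified "some statement" condition, and the list is shortened to 5 sets. One thing that does follow from the question's code: line[1] is a string, so it has to be converted with int() before it can be used as a list index.

```python
import io

# Hypothetical stand-in for the file `f`; the real input format is not shown.
f = io.StringIO("group 2\nalpha\nbeta\ngroup 0\ngamma\n")

def is_header(line):
    # Placeholder for the question's unspecified "some statement" condition.
    return line[0] == 'group'

sets = [set() for _ in range(5)]   # 5 instead of 5000 to keep the demo small
i = 0
for each in f:
    line = each.split()
    if not line:                   # skip blank lines
        continue
    if is_header(line):
        i = int(line[1])           # list indices must be integers, not strings
    else:
        sets[i].add(line[0])

print(sorted(sets[2]))  # ['alpha', 'beta']
print(sorted(sets[0]))  # ['gamma']
```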
Rohit Jain

This works:

>>> lotsosets=[set() for i in range(5)]
>>> lotsosets
[set([]), set([]), set([]), set([]), set([])]
>>> lotsosets[0].add('see me?')
>>> lotsosets
[set(['see me?']), set([]), set([]), set([]), set([])]
>>> lotsosets[1].add('imma here too')
>>> lotsosets
[set(['see me?']), set(['imma here too']), set([]), set([]), set([])]

You should only use the form [x]*5000 if x is something immutable:

>>> li=[None]*5
>>> li
[None, None, None, None, None]
>>> li[0]=0
>>> li
[0, None, None, None, None]
>>> li[1]=1
>>> li
[0, 1, None, None, None]
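The immutable case is safe because li[0]=0 rebinds a slot rather than changing the shared object. The same distinction holds for a list of sets, as this small sketch suggests (slots is a hypothetical name): assigning to an index affects only that index, while mutating through an index affects every slot that still shares the object.

```python
# Illustrative sketch; `slots` is a hypothetical name.
slots = [set()] * 3

slots[0] = {'new'}     # rebinding: slot 0 now points at a different set
after_rebind = [sorted(s) for s in slots]

slots[1].add('x')      # mutation: changes the set still shared by slots 1 and 2
after_mutate = [sorted(s) for s in slots]

print(after_rebind)    # [['new'], [], []]
print(after_mutate)    # [['new'], ['x'], ['x']]
```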

Or if having multiple references to a single item, like an iterator, produces the desired behavior:

>>> [iter('abc')]*3
[<iterator object at 0x100498410>, 
 <iterator object at 0x100498410>, 
 <iterator object at 0x100498410>]   # 3 references to the SAME object

Note the repeated reference to the same iterator. With zip this is useful: each of the three arguments pulls the next item from the one shared iterator, so the items come out grouped in threes:

>>> zip(*[iter('abcdef')]*3)
[('a', 'b', 'c'), ('d', 'e', 'f')]
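The session above is from Python 2, where zip returns a list. On Python 3, zip returns a lazy iterator, so the result needs to be wrapped in list() to display it. A minimal sketch, with chunks as a hypothetical helper name:

```python
def chunks(iterable, n):
    """Group items into n-tuples; a trailing incomplete group is dropped."""
    it = iter(iterable)
    # [it] * n is n references to the SAME iterator, which is intended here.
    return list(zip(*[it] * n))

result = chunks('abcdefg', 3)
print(result)  # [('a', 'b', 'c'), ('d', 'e', 'f')]
```

The leftover 'g' is consumed and discarded because zip stops as soon as any argument is exhausted.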

Or a subset of a longer iterator:

>>> [next(x) for x in [iter('abcdef')]*3]
['a', 'b', 'c']

Whereas something like [list()]*5 probably does not produce what is intended:

>>> li=[list()]*5
>>> li
[[], [], [], [], []]
>>> li[0].append('whoa')
>>> li
[['whoa'], ['whoa'], ['whoa'], ['whoa'], ['whoa']]
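A common place where both rules meet is a nested list used as a grid. In this sketch (grid_bad and grid_ok are illustrative names), the inner [0]*3 is fine because integers are immutable, but multiplying the outer list repeats a reference to one mutable row:

```python
# Illustrative names; not from the original answer.
grid_bad = [[0] * 3] * 2                 # two references to the same row
grid_ok = [[0] * 3 for _ in range(2)]    # a fresh row per iteration

grid_bad[0][0] = 1
grid_ok[0][0] = 1

print(grid_bad)  # [[1, 0, 0], [1, 0, 0]]
print(grid_ok)   # [[1, 0, 0], [0, 0, 0]]
```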
dawg