
I have a Series in pandas consisting of empty lists.

>>> s = pd.Series( [[]] * 20, index= range(0,20) )

I want to add tags to certain elements by index.

>>> for i in [1,3,5,7,11]:
...     s.loc[ i ].append('prime')

but this is what I keep getting:

>>> s
0     [prime, prime, prime, prime, prime]
1     [prime, prime, prime, prime, prime]
2     [prime, prime, prime, prime, prime]
3     [prime, prime, prime, prime, prime]
4     [prime, prime, prime, prime, prime]
5     [prime, prime, prime, prime, prime]
6     [prime, prime, prime, prime, prime]
7     [prime, prime, prime, prime, prime]
8     [prime, prime, prime, prime, prime]
9     [prime, prime, prime, prime, prime]
10    [prime, prime, prime, prime, prime]
11    [prime, prime, prime, prime, prime]
12    [prime, prime, prime, prime, prime]
13    [prime, prime, prime, prime, prime]
14    [prime, prime, prime, prime, prime]
15    [prime, prime, prime, prime, prime]
16    [prime, prime, prime, prime, prime]
17    [prime, prime, prime, prime, prime]
18    [prime, prime, prime, prime, prime]
19    [prime, prime, prime, prime, prime]
dtype: object

That is not what I want: every row got the tag, five times over.

I would like it to be like this:

>>> s
0     []
1     [prime]
2     []
3     [prime]
4     []
5     [prime]
6     []
7     [prime]
8     []
9     []
10    []
11    [prime]
12    []
...

I've been banging my head against the desk for an hour on this. Total pandas newb.

UPDATE

The following works as expected.

>>> s = pd.Series( [[]] * 20, index= range(0,20) )
>>> for i in [1,3,5,7,11]:
...     s.loc[ i ] = s.loc[ i ] + ['prime']

I will eventually want multiple 'tags' on each index, so this is only a fallback for the moment. I would still like to know why the append doesn't work.

rawkintrevo
    Not sure what you are trying to achieve here but you could just change your line to this: `s.iloc[ i ]=['prime']` – EdChum Jan 07 '15 at 22:22
    Be warned: having lists (or, more generally, non-scalars) as elements in Series and DataFrames usually leads to headaches, even though it's sometimes useful as an intermediate step. – DSM Jan 07 '15 at 22:24

1 Answer


Try this:

s = pd.Series([[] for _ in range(20)], index= range(0,20) )
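With the Series built that way, the loop from the question should tag only the rows it touches. A minimal sketch, assuming pandas is imported as `pd`; the extra `'odd'` tag is a made-up example to show that several tags can accumulate on one row:

```python
import pandas as pd

# The comprehension evaluates [] once per row, giving 20 separate lists.
s = pd.Series([[] for _ in range(20)], index=range(20))

# s.loc[i] returns the list stored at label i, so append changes only that row.
for i in [1, 3, 5, 7, 11]:
    s.loc[i].append('prime')

# Hypothetical second tag, added to a single row.
s.loc[3].append('odd')

print(s.head(6))
```

Rows that were never touched stay as empty lists.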

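Your problem is that `[[]] * 20` does not create 20 different empty lists: it creates one list and repeats a reference to it 20 times, so an append through any index changes that single shared list. That is also why your `s.loc[i] + ['prime']` version works: `+` builds a new list instead of mutating the shared one. An example so that you can see the problem: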

>>> lists = [[]] * 5
>>> lists
[[], [], [], [], []]
>>> lists[0].append(1)
>>> lists
[[1], [1], [1], [1], [1]]
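One way to confirm the sharing is to compare object identity with `is`. This is a small plain-Python sketch; the names `shared` and `distinct` are only for illustration:

```python
# Multiplying a list repeats references to the same inner list.
shared = [[]] * 3

# A comprehension evaluates [] anew on every iteration.
distinct = [[] for _ in range(3)]

print(shared[0] is shared[1])      # True: one object, seen three times
print(distinct[0] is distinct[1])  # False: separate objects

# Appending to one of the separate lists leaves the others alone.
distinct[0].append(1)
print(distinct)                    # [[1], [], []]
```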
elyase