
I have a column in a pandas dataframe dfr in which every row holds an empty list. When I try to append to the list in one row, the entire column changes.

Here is the code:

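import pandas as pd

N = 10
Nr = list(range(N))
dfr = pd.DataFrame(Nr, columns=['ID'])
# intended: an empty list in every row
dfr['Assignment'] = [[]] * dfr.shape[0]
for i in range(N):
    # intended: append i only to the list in row i
    dfr.loc[i][1].append(i)
dfr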

When I run this, the whole Assignment column changes. Can anyone help me here? I just need the list in each row to contain the single value i for that row.


Kevin Hernandez

2 Answers

1

Rather than iterating through the full dataframe, it is easier and faster to build a list with the desired values and assign it as the column. If I understand correctly, this matches your expected output:

assignment = []
for i in range(N):
    assignment.append([i])
dfr['Assignment'] = assignment
print(dfr)

Output:

   ID Assignment
0   0        [0]
1   1        [1]
2   2        [2]
3   3        [3]
4   4        [4]
5   5        [5]
6   6        [6]
7   7        [7]
8   8        [8]
9   9        [9]
Celius Stingher
  • The expected output matches. However, there are certain operations I need to perform that require me to iterate over the dataframe and update the lists as needed. – Amogh Bhosekar Feb 20 '20 at 19:49
  • The thing is, if you create one list as the value for the whole column and later modify it while iterating, the change shows up in every row, because every row refers to the same list. That's what AMC hinted at. – Celius Stingher Feb 20 '20 at 19:57
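
To make the shared-list point concrete, here is a minimal plain-Python sketch (no pandas involved). The names `shared` and `independent` are only illustrative and do not come from the question; the sketch just contrasts `[[]] * n` with a list comprehension.

```python
# Multiplying a list repeats references to the same inner list.
shared = [[]] * 3
shared[0].append("x")
print(shared)  # [['x'], ['x'], ['x']]

# A comprehension evaluates [] once per iteration, so every list is distinct.
independent = [[] for _ in range(3)]
independent[0].append("x")
print(independent)  # [['x'], [], []]

# Identity checks confirm the difference.
print(shared[0] is shared[1])            # True
print(independent[0] is independent[1])  # False
```

The same aliasing carries over when such a list is stored in a dataframe column, which is why appending to one row appears to change them all.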
0

As mentioned by @AMC, this happens because every cell in your dataframe refers to the same list object. As a result, each iteration over the cells appends a number to that one list. I therefore suggest creating one list per cell, as follows:

for i in range(N):
    dfr.at[i,'Assignment'] = [i]

   ID Assignment
0   0        [0]
1   1        [1]
2   2        [2]
3   3        [3]
4   4        [4]
5   5        [5]
6   6        [6]
7   7        [7]
8   8        [8]
9   9        [9]

Then you can update these cells independently:

for i in range(N):
    dfr.at[i,'Assignment'].append(i+1) 

   ID Assignment
0   0     [0, 1]
1   1     [1, 2]
2   2     [2, 3]
3   3     [3, 4]
4   4     [4, 5]
5   5     [5, 6]
6   6     [6, 7]
7   7     [7, 8]
8   8     [8, 9]
9   9    [9, 10]
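
If the column has to start out with empty lists, as in the question, one possible variation is to build it with a list comprehension so that each row gets its own list. This is only a sketch under the question's setup (`N`, `dfr`, and the `ID` and `Assignment` columns); it assumes the default integer index, so `.at[i, ...]` addresses row `i`.

```python
import pandas as pd

N = 10
dfr = pd.DataFrame({"ID": range(N)})

# One fresh empty list per row, instead of [[]] * N (one list repeated N times).
dfr["Assignment"] = [[] for _ in range(len(dfr))]

# .at returns the list stored in that cell, so append only touches row i.
for i in range(N):
    dfr.at[i, "Assignment"].append(i)

print(dfr)
```

Because every cell now holds a distinct list, later in-place updates inside a loop stay local to the row being modified.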
user3806649