
I have created a list of 2 DataFrames and then assigned a value in the first one:

import pandas as pd

df = [pd.DataFrame(0, columns = ['a', 'b', 'c'], index = range(0, 5))] * 2

df[0]['a'][1] = 55

Current output:

[    a  b  c
 0   0  0  0
 1  55  0  0
 2   0  0  0
 3   0  0  0
 4   0  0  0,     a  b  c
 0   0  0  0
 1  55  0  0
 2   0  0  0
 3   0  0  0
 4   0  0  0]

Expected output:

[    a  b  c
 0   0  0  0
 1  55  0  0
 2   0  0  0
 3   0  0  0
 4   0  0  0,     a  b  c
 0   0  0  0
 1   0  0  0
 2   0  0  0
 3   0  0  0
 4   0  0  0]

I would expect this code snippet to assign a value of 55 to the first df only, at column a and index 1. However, it assigns the value in both DataFrames.

I suspect the issue arises because I created the list by multiplying a one-element list by 2, so the assignment is applied to every df in the list.

Is there a way to create a list of multiple dfs so that an assignment affects only the df selected by its list index?

Thanks

Moshee
    When you multiply a list like that, you're not making more copies; you're making duplicate references to the same object. Both elements point to the same DataFrame, so it is simply shown twice. – Jab Oct 21 '19 at 14:40

2 Answers


Use a list comprehension instead, so that the DataFrame constructor runs once per element:

df = [pd.DataFrame(0, columns = ['a', 'b', 'c'], index = range(0, 5)) for _ in range(2)]

df[0].loc[1, 'a'] = 55  # .loc avoids chained assignment

# Output:
[    a  b  c
0   0  0  0
1  55  0  0
2   0  0  0
3   0  0  0
4   0  0  0,    a  b  c
0  0  0  0
1  0  0  0
2  0  0  0
3  0  0  0
4  0  0  0]
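
To see why the two approaches differ, a quick identity check with `is` can help. This is only a minimal sketch: the names `template`, `aliased`, `independent`, `same_object` and `distinct_objects` are made up for illustration and do not come from the question.

```python
import pandas as pd

template = pd.DataFrame(0, columns=['a', 'b', 'c'], index=range(5))

# List multiplication repeats the reference; it does not copy the DataFrame.
aliased = [template] * 2
same_object = aliased[0] is aliased[1]  # both slots name one object

# A list comprehension evaluates the constructor once per iteration,
# so each element is a separate DataFrame.
independent = [
    pd.DataFrame(0, columns=['a', 'b', 'c'], index=range(5))
    for _ in range(2)
]
distinct_objects = independent[0] is not independent[1]

print(same_object, distinct_objects)
```

If both flags print as `True`, the multiplied list holds one shared DataFrame while the comprehension holds two separate ones.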
Henry Yik

Multiplying the list did not copy anything: both elements are references to the same DataFrame. Build each DataFrame separately instead:

df = [pd.DataFrame(0, columns = ['a', 'b', 'c'], index = range(0, 5)) for _ in range(2)]
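
If you already have a DataFrame and want several independent copies of it, `DataFrame.copy()` (which copies the data by default) should work in the same way. The following is a rough sketch under that assumption; `base`, `dfs`, `first_value` and `second_value` are illustrative names only.

```python
import pandas as pd

base = pd.DataFrame(0, columns=['a', 'b', 'c'], index=range(5))

# Each call to .copy() returns a new DataFrame with its own data.
dfs = [base.copy() for _ in range(2)]

# Assign through .loc so the write targets exactly one cell of one DataFrame.
dfs[0].loc[1, 'a'] = 55

first_value = int(dfs[0].loc[1, 'a'])
second_value = int(dfs[1].loc[1, 'a'])
print(first_value, second_value)
```

Only the first DataFrame in the list should change; the second one and `base` keep their zeros.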
Wai Ha Lee
Carsten