I found a workaround. First, create a new dataframe b and copy column b from dataframe a into it, then add an empty column a.
import copy
import pandas as pd

b = pd.DataFrame()
b["b"] = a["b"]  # copy the scalar column
b["a"] = None    # placeholder column for the lists
Now iterate over the rows of b and copy the values of column a from dataframe a. Note that we cast b to object dtype first so that lists can be assigned to individual cells; otherwise the assignment raises a ValueError.
b = b.astype(object)
for i in range(len(b)):
    # deep-copy each list so b gets its own objects (assumes the default 0..n-1 index)
    b.loc[i, "a"] = copy.deepcopy(a.loc[i, "a"])
Then make the required changes:
b.a[0][0] = 0
Now b is:
   b       a
0  3  [0, 1]
1  2  [2, 2]
2  1  [3, 3]
And a remained unchanged:
        a  b
0  [1, 1]  3
1  [2, 2]  2
2  [3, 3]  1
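The per-row loop above could also be condensed into a single column assignment. The following is only a sketch: the contents of a are an assumption reconstructed from the output shown above, and using Series.apply with copy.deepcopy is a swapped-in alternative to the loop, not something the original workaround used.

```python
import copy
import pandas as pd

# Assumed input, reconstructed from the printed output above.
a = pd.DataFrame({"a": [[1, 1], [2, 2], [3, 3]], "b": [3, 2, 1]})

# Copy the frame, then replace the list column with per-cell deep copies
# so that b holds its own list objects instead of references to a's lists.
b = a.copy()
b["a"] = a["a"].apply(copy.deepcopy)

# Mutate the list stored in b only.
b.loc[0, "a"][0] = 0

print(a.loc[0, "a"])  # [1, 1]
print(b.loc[0, "a"])  # [0, 1]
```

Unlike the loop, this does not depend on the index being 0..n-1, because no row labels are looked up by position.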
Now let's see what might have been the issue with using copy. I used the id function after applying copy and the results were:
b = copy.copy(a)
print(id(a) == id(b))  # False
print(id(a["a"]) == id(b["a"]))  # False
print(id(a.loc[0, "a"]) == id(b.loc[0, "a"]))  # True
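Since column a contains lists, both dataframes refer to the same list objects, so modifying a list through one dataframe also affects the other. This was the intuition behind copying the lists one by one. Note that the same results were seen even with b=copy.deepcopy(a). This is consistent with the pandas documentation for DataFrame.copy, which notes that with deep=True the data is copied but Python objects stored in it are not copied recursively.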
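The sharing described above can be checked directly with the `is` operator, which avoids comparing id values of temporary objects. This is a minimal sketch; the contents of a are again an assumption based on the printed output, and the variable names are illustrative.

```python
import copy
import pandas as pd

# Assumed input, reconstructed from the printed output in this answer.
a = pd.DataFrame({"a": [[1, 1], [2, 2], [3, 3]], "b": [3, 2, 1]})

shallow = copy.copy(a)
deep = copy.deepcopy(a)

# The dataframes themselves are distinct objects...
frames_distinct = (a is not shallow) and (a is not deep)

# ...but the cells of the object column still point at the same lists.
shared_after_copy = a.loc[0, "a"] is shallow.loc[0, "a"]
shared_after_deepcopy = a.loc[0, "a"] is deep.loc[0, "a"]

print(frames_distinct, shared_after_copy, shared_after_deepcopy)
```

If both shared flags print True, as the id comparison in this answer reported, the lists have to be copied cell by cell to get independent data.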