I have seen questions about the copy() method not working for dataframes with nested data columns: modifying something on the copy also altered the original dataframe. However, all I could find was about renaming a nested field of the dataframe, on this question.
I am not renaming anything, though; I am altering a field inside the nested column. So I wanted to confirm whether that also alters the original dataframe even though a copy was made. If so, how can I make a copy of a dataframe with nested columns that doesn't affect the original?
For example, in this code I have a dataframe with a column of dictionaries. Each dictionary has a single field holding a list. The values were expected to be all integers, but some floats slipped in, so I want to convert them all to integers without altering the original dataframe.
However, if I apply a user-defined function to a copied dataframe, it affects the original as well:
import pandas as pd

df = pd.DataFrame({'a': [{'field': [1, 2, 3.0]}, {'field': [1, 2, 4.0]}, {'field': [1, 2, 5.0]}]})
print('printing the original dataframe: \n', df['a'])

def integer_converter(x):
    # x is one row of the dataframe; x['a'] is the dict stored in that row
    x['a']['field'] = [int(i) for i in x['a']['field']]

df2 = df.copy(deep=True)
df2.apply(integer_converter, axis=1)
print('printing df2 after function: \n', df2['a'])
print('printing the original dataframe again: \n', df['a'])
The outputs were:
printing the original dataframe:
0    {'field': [1, 2, 3.0]}
1    {'field': [1, 2, 4.0]}
2    {'field': [1, 2, 5.0]}
Name: a, dtype: object
printing df2 after function:
0    {'field': [1, 2, 3]}
1    {'field': [1, 2, 4]}
2    {'field': [1, 2, 5]}
Name: a, dtype: object
printing the original dataframe again:
0    {'field': [1, 2, 3]}
1    {'field': [1, 2, 4]}
2    {'field': [1, 2, 5]}
Name: a, dtype: object
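For reference, here is a minimal sketch of what seems to be going on and one possible workaround. It is not a definitive answer. As far as I understand, copy(deep=True) copies the column's array of references but does not recursively copy the Python objects inside it, so both dataframes end up pointing at the same dicts. The sketch assumes that copying each cell with the standard library's copy.deepcopy is acceptable for the data size; the names plain_copy, shares_dicts and the loop-based conversion are only illustrative.

```python
import copy

import pandas as pd

df = pd.DataFrame({'a': [{'field': [1, 2, 3.0]},
                         {'field': [1, 2, 4.0]},
                         {'field': [1, 2, 5.0]}]})

# Check whether a "deep" pandas copy still holds the very same dict objects.
plain_copy = df.copy(deep=True)
shares_dicts = plain_copy['a'].iloc[0] is df['a'].iloc[0]
print('copy shares dicts with original:', shares_dicts)

# Possible workaround (an assumption, not an official pandas recipe):
# deep-copy every cell so that df2 owns its own dicts and lists.
df2 = df.copy(deep=True)
df2['a'] = [copy.deepcopy(cell) for cell in df['a']]

# In-place conversion now only touches the objects owned by df2.
for cell in df2['a']:
    cell['field'] = [int(i) for i in cell['field']]

print('df2 after conversion: \n', df2['a'])
print('original after conversion: \n', df['a'])
```

Another option would be to avoid in-place mutation altogether and have the converter return a new dict for each cell, which would make the extra deepcopy unnecessary.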