I am new to Python and am absolutely foxed by why the following happens:
- I start with a dataframe df1
- I make a copy of it and call it df2
- I change a value in the copy (df2)
- That changes the value in df1 also!
Here is a modified version of code I found in another question on Stack Overflow (the original question is "Replace single value in a pandas dataframe, when index is not known and values in column are unique"):
import pandas as pd

# Create a dataframe df1
df1 = pd.DataFrame([[5, 2], [3, 4]], columns=('a', 'b'))
# print df1
df1
   a  b
0  5  2
1  3  4
# copy it into df2
df2 = df1
# print df2
df2
   a  b
0  5  2
1  3  4
# modify the value in df2 in column b where column a is 3
df2.loc[df2.a == 3, 'b'] = 6
# print df2 to check that the value has changed
df2
   a  b
0  5  2
1  3  6
# BUT changing df2 changed df1 also! Print df1
df1
   a  b
0  5  2
1  3  6
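To narrow this down, here is a minimal sketch of a check that may be relevant. It assumes pandas is imported as pd. The names same_object and df3 are mine and only for illustration. It uses Python's `is` operator to test whether df1 and df2 are actually the same object, and for comparison it builds df3 with `DataFrame.copy()` and changes that instead.

```python
import pandas as pd

df1 = pd.DataFrame([[5, 2], [3, 4]], columns=('a', 'b'))

# Plain assignment, as in the code above
df2 = df1

# `is` tests object identity: are df1 and df2 the very same object?
same_object = df2 is df1
print(same_object)

# For comparison: build df3 with DataFrame.copy() and modify df3 only
df3 = df1.copy()
df3.loc[df3.a == 3, 'b'] = 6

# Compare the value in row 1, column b of the original and of df3
print(df1.loc[1, 'b'])
print(df3.loc[1, 'b'])
```

If the identity check prints True, that would suggest `df2 = df1` does not create a second dataframe at all, only a second name for the first one, which would be consistent with the behaviour above.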
Can someone please explain this? Thanks