I am applying a very simple transformation to a pandas DataFrame inside a function, and I did not expect the function to change the input DataFrame (but it did). I am wondering why.
Here's my code:
import pandas as pd

x = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 4, 5]})

def transform(df, increment):
    new_df = df
    new_df.a = new_df.a + increment
    return new_df

new_x = transform(x, 1)
new_x  # output shows new_x.a is [2, 3, 4], which is expected
x      # output shows x.a is also [2, 3, 4]; I thought it should be [1, 2, 3]
Why is this the case? I thought all the operations inside the function were executed on new_df, so the input x should stay exactly the same before and after I run transform, shouldn't it?
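
For reference, here is a minimal sketch of what seems to be going on, assuming ordinary Python assignment semantics: a plain assignment such as new_df = df only gives the same DataFrame object a second name, so changes made through either name are visible through both. The function transform_copy and the variables alias and same_object are hypothetical names introduced for illustration only; the sketch uses DataFrame.copy() to get an independent frame before modifying it.

```python
import pandas as pd

x = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 4, 5]})

# Plain assignment does not copy: both names refer to one object.
alias = x
same_object = alias is x

def transform_copy(df, increment):
    # Hypothetical variant of transform: copy first, then modify the copy,
    # so the caller's DataFrame is left untouched.
    new_df = df.copy()
    new_df['a'] = new_df['a'] + increment
    return new_df

new_x = transform_copy(x, 1)

print(same_object)          # True
print(new_x['a'].tolist())  # [2, 3, 4]
print(x['a'].tolist())      # [1, 2, 3]
```

With the explicit copy, x keeps its original values, which is the behavior I expected from transform in the first place.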