
Has anyone else seen this behavior before?

I've got a short code snippet below:

 import pandas as pd

 df1 = pd.DataFrame({'a': [1, 2], 'b': [10, 20]})
 df2 = df1
 df2['newcol'] = 1
 print('df1\n', df1)
 print('df2\n', df2)

All day I've been getting very strange behaviour. The output is:

 df1
    a   b  newcol
 0  1  10       1
 1  2  20       1

 df2
    a   b  newcol
 0  1  10       1
 1  2  20       1

For some weird reason df1 has the new column as well as df2!!

I'm running PyCharm with Python 3.5.2.

I've never had this problem before. I've tried reinstalling PyCharm and rebooting everything I can think of!

It seems to have something to do with the fact that I created df2 from df1, but what is going on here, and how do I stop it?

A Rob4

1 Answer


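Assignment (df2 = df1) does not create a new object in Python; both names refer to the same DataFrame, so a change made through one name is visible through the other. To get a new, independent mutable object (here a DataFrame) you need to copy it:

 df2 = df1.copy()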

A better explanation is here.
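
As a minimal sketch of the difference between assignment and copy, the snippet below reuses df1 from the question. The names alias and independent are illustrative only and do not appear in the original code.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2], 'b': [10, 20]})

alias = df1                # plain assignment: a second name for the same object
independent = df1.copy()   # a new DataFrame holding its own copy of the data

alias['newcol'] = 1        # modifies the single object that both alias and df1 name

print(alias is df1)                      # True: same object
print(independent is df1)                # False: separate object
print('newcol' in df1.columns)           # True: the change shows up under df1
print('newcol' in independent.columns)   # False: the copy is unaffected
```

Checking with the is operator is a quick way to tell whether two names refer to the same DataFrame.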

jezrael