import numpy as np
import pandas as pd

df = pd.DataFrame([[np.nan, 2, 1, 0],
                   [3, 4, np.nan, 1],
                   [np.nan, np.nan, 8, 5],
                   [np.nan, 3, np.nan, 4]],
                  columns=list('ABCD'))
df2 = df
df.fillna(value = df.mean(), inplace=True)

Now df2 and df are identical. How do I avoid changing df2?

apkul
  • Assignment statements in Python do not copy objects, they create bindings between a target and an object. Check this. https://stackoverflow.com/questions/21537078/unexpected-list-behavior-in-python – Orhan Solak Apr 25 '18 at 23:27

3 Answers

Consider making a copy of df with the DataFrame.copy method, so that df2 refers to its own data: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.copy.html

zoran119

With df2 = df you're "pointing" df2 to the same object that df points to. There is only one DataFrame, so any change made through one name shows up under the other.

(The following is from the Python docs).

Assignment statements in Python do not copy objects, they create bindings between a target and an object. For collections that are mutable or contain mutable items, a copy is sometimes needed so one can change one copy without changing the other.
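The binding behaviour quoted above can be checked with the `is` operator. This is only a minimal sketch; the two-column frame and the variable names `same_object` and `seen_via_df2` are made up for illustration and are not from the question:

```python
import numpy as np
import pandas as pd

# Small illustrative frame (values are invented for this sketch).
df = pd.DataFrame({"A": [np.nan, 3.0], "B": [2.0, 4.0]})

df2 = df                         # binds a second name to the same object, no copy
same_object = df2 is df          # both names refer to one DataFrame

df.loc[0, "A"] = 99.0            # mutate through one name...
seen_via_df2 = df2.loc[0, "A"]   # ...and read the change through the other
```

Here `same_object` is True, which is why an in-place change on df also appears in df2.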

To copy a DataFrame, use: df2 = df.copy()
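Applied to the frame from the question, that could look roughly like the sketch below. The helper names `df_has_nan` and `df2_has_nan` are only there to inspect the result and are not part of the original code:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[np.nan, 2, 1, 0],
                   [3, 4, np.nan, 1],
                   [np.nan, np.nan, 8, 5],
                   [np.nan, 3, np.nan, 4]],
                  columns=list('ABCD'))

df2 = df.copy()                            # independent copy (deep by default)
df.fillna(value=df.mean(), inplace=True)   # fills NaNs in df only

# Check which frame still contains missing values.
df_has_nan = bool(df.isna().any().any())
df2_has_nan = bool(df2.isna().any().any())
```

After this, df has its NaNs replaced by the column means while df2 still holds the original values, NaNs included.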

njha

Thanks for the responses. To summarize: inplace=True modifies the object itself, so every name bound to that object sees the change. In my example df2 = df only creates a second name for the same DataFrame; to avoid modifying df2, I should use df2 = df.copy() instead of df2 = df.
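As a side note, leaving out inplace=True should also keep df2 untouched, because fillna then returns a new DataFrame instead of changing the existing one. A small sketch, assuming a made-up two-column frame and a hypothetical result name `filled`:

```python
import numpy as np
import pandas as pd

# Illustrative frame, not the one from the question.
df = pd.DataFrame({"A": [np.nan, 3.0], "B": [2.0, np.nan]})
df2 = df                          # same object as df

filled = df.fillna(df.mean())     # returns a new DataFrame; df is not changed

original_still_has_nan = bool(df2.isna().any().any())
filled_has_nan = bool(filled.isna().any().any())
```

Note that df2 is still the same object as df here, so this only helps as long as nothing else modifies df in place.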

apkul