
I'm a bit confused about the inplace argument. In particular, is there any benefit in using it over the standard approach of writing df = df.do_something(), which makes it explicit that we are changing the dataframe we are working with?

  • Probably not. Maybe it does no harm and you like how the code reads, but check https://stackoverflow.com/questions/43893457/understanding-inplace-true-in-pandas and https://stackoverflow.com/questions/45570984/in-pandas-is-inplace-true-considered-harmful-or-not – Ignatius Reilly Sep 02 '23 at 06:24

2 Answers


There is a big difference if several names point to the same object. df = df.do_something() does not work in place; it creates a new object and rebinds the name df to it, leaving the original object untouched.

Here is an example demonstrating the difference.

When df is reassigned, df2 is unchanged: it still points to the original object, while df now points to the new output.

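import pandas as pd

df = pd.DataFrame({'col1': [1, float('nan'), 3]})
df2 = df  # df2 is a second name for the same object

df = df.dropna()  # df now names a new object; df2 still names the original
print(df2)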

   col1
0   1.0
1   NaN
2   3.0

With inplace=True the object itself is modified, so every name that points to it sees the change.

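import pandas as pd

df = pd.DataFrame({'col1': [1, float('nan'), 3]})
df2 = df  # df2 is a second name for the same object

df.dropna(inplace=True)  # modifies the shared object; nothing is rebound
print(df2)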

   col1
0   1.0
2   3.0
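A quick way to confirm what happened in the two cases above is to check object identity with `is`. This is only a minimal sketch; `make_frame` and the `*_b` names are hypothetical helpers introduced for illustration, not part of the original example.

```python
import pandas as pd


def make_frame():
    # Hypothetical helper: rebuilds the same small frame used above.
    return pd.DataFrame({'col1': [1, float('nan'), 3]})


# Case 1: reassignment. dropna() returns a new object and df is rebound to it.
df = make_frame()
df2 = df
df = df.dropna()
reassigned_same_object = df is df2

# Case 2: inplace=True. The shared object is modified; no name is rebound.
df_b = make_frame()
df2_b = df_b
df_b.dropna(inplace=True)
inplace_same_object = df_b is df2_b

print(reassigned_same_object, len(df2))    # False 3
print(inplace_same_object, len(df2_b))     # True 2
```

After reassignment the two names refer to different objects, so df2 keeps all three rows. With inplace=True both names still refer to one object, so both see the two remaining rows.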

In real life, I don't really see a use case for keeping multiple references to the same object. Given the complexity of pandas, it is sometimes hard to keep track of views and copies. I would recommend avoiding inplace=True. In fact, there is a discussion about removing this parameter from most functions.

mozway

inplace=True: writes to (edits) the original data in RAM, so there is nothing that needs to be returned.

inplace=False: creates a new copy of the data in RAM, edits that copy, and returns it.
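The difference in return values can be seen with a small sketch like the one below. It reuses the dropna example from the other answer; the variable names are illustrative only. It assumes pandas 1.x or 2.x, where the inplace call returns None; other versions may behave differently, so treat that part as an assumption.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, float('nan'), 3]})

# inplace=False (the default): a new DataFrame is returned, df is untouched.
result_copy = df.dropna()
rows_in_df_after_copy = len(df)

# inplace=True: df itself is edited. In pandas 1.x/2.x the call returns None
# (an assumption for any other version).
result_inplace = df.dropna(inplace=True)
rows_in_df_after_inplace = len(df)

print(type(result_copy).__name__, rows_in_df_after_copy)
print(result_inplace, rows_in_df_after_inplace)
```

Because the inplace call is not meant to hand back a useful value, writing df = df.dropna(inplace=True) is a common mistake: under the assumption above it would rebind df to None.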

Jiu_Zou