I'm a bit confused about the inplace argument. In particular, is there any benefit to using it over the standard approach of just writing df = df.method(), which makes it clear that we are changing the dataframe we are working with?
Probably not. Maybe it does no harm and you like how the code reads, but check https://stackoverflow.com/questions/43893457/understanding-inplace-true-in-pandas and https://stackoverflow.com/questions/45570984/in-pandas-is-inplace-true-considered-harmful-or-not – Ignatius Reilly Sep 02 '23 at 06:24
2 Answers
There is a big difference if several names point to the same object. df = df.do_something() does not work in place; it just binds the name df to the new object that the method returns.
Here is an example demonstrating the difference.
After reassigning df, df2 is unchanged: it still points to the original object, while df points to the new output.
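import pandas as pd
df = pd.DataFrame({'col1': [1, float('nan'), 3]})
df2 = df           # df2 and df are two names for the same object
df = df.dropna()   # df now names a new object; df2 still names the original
print(df2)
   col1
0   1.0
1   NaN
2   3.0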
With inplace=True, the object is modified in place, so the change is visible through df2 as well.
import pandas as pd
df = pd.DataFrame({'col1': [1, float('nan'), 3]})
df2 = df                  # df2 and df are two names for the same object
df.dropna(inplace=True)   # modifies that shared object
print(df2)
   col1
0   1.0
2   3.0
In real life, I don't really see a use case for keeping multiple references to the same object. Given the complexity of pandas, it's sometimes hard to keep track of views and copies. I would recommend avoiding inplace=True. In fact, there is a discussion about removing this parameter from most functions.
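As a rough sketch of the alternative recommended here, the steps can be chained and the result bound to a name, with no inplace=True involved. The names alias and cleaned and the extra reset_index step are only illustrative choices and are not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, float("nan"), 3]})
alias = df  # a second name bound to the same object (illustrative)

# Chain the steps and bind the result to a name; the original is left alone.
cleaned = df.dropna().reset_index(drop=True)

print(cleaned is df)          # False: dropna returned a new object
print(alias is df)            # True: both names still share the original
print(len(df), len(cleaned))  # 3 2
```

The `is` check is a quick way to tell whether two names refer to the same DataFrame when it is unclear which one a later operation will affect.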

Interesting, had no idea that inplace can actually create such a difference. Thank you for the clarification. – ProjectAvatar Sep 02 '23 at 08:58
inplace=True: the original data in memory is edited, and nothing is returned (the method returns None).
inplace=False: a new copy of the data is created in memory, edited, and then returned.
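A minimal sketch of the two behaviours, using dropna as in the answer above. The variable names new_df and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, float("nan"), 3]})

# inplace=False (the default): a new DataFrame comes back, df is untouched.
new_df = df.dropna()
print(len(df), len(new_df))  # 3 2

# inplace=True: df itself is modified and the call returns None.
result = df.dropna(inplace=True)
print(result)   # None
print(len(df))  # 2
```

Because the in-place call returns None, writing df = df.dropna(inplace=True) would leave df bound to None, which is a common mistake when mixing the two styles.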
