I have some NaN values in my pandas DataFrame and I want to replace them with the median value of each column. I am aware that this is very similar to the question "pandas DataFrame: replace nan values with average of columns".

However when I try the equivalent:

df = df.fillna(df.median())

I get the following warning:


Python\Python37\site-packages\pandas\core\generic.py:6287: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._update_inplace(new_data)
Python\Python37\site-packages\pandas\core\frame.py:4244: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  **kwargs

Any advice is appreciated. Thanks

Agustin
  • `A value is trying to be set on a copy of a slice from a DataFrame`: is your `df` a part of some bigger dataframe? – Quang Hoang Oct 16 '19 at 16:01
  • do `df = df.copy()` above that line. You can read extensively about it in https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas – ALollz Oct 16 '19 at 16:02
  • Yes, I extracted it from a larger one. How can I extract it without getting this error? I used this initially: df = data.iloc[:, 0:-1] – Agustin Oct 16 '19 at 16:03
  • agreed with @ALollz, do a copy of your dataframe. Additional resource: https://www.dataquest.io/blog/settingwithcopywarning/ – florian Oct 16 '19 at 16:03
  • @ALollz that did not work unfortunately. – Agustin Oct 16 '19 at 16:04
  • If you want to update `data`, do `data.iloc[:,0:-1] = df.fillna(df.median())`. If not, you should include an [MRE](https://stackoverflow.com/help/minimal-reproducible-example). – Quang Hoang Oct 16 '19 at 16:07
  • Perfect, that worked. Thanks! – Agustin Oct 16 '19 at 16:12
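
The two suggestions from the comments can be sketched side by side. This is a minimal illustration on toy data, not the asker's actual frame: the contents of `data` and its column names are assumptions made up for the example, and only the `data.iloc[:, 0:-1]` slicing comes from the thread. Agustin reported that the write-back variant was the one that worked in their case.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the larger frame `data` mentioned in the comments.
data = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, 4.0, 6.0],
    "label": [0, 1, 0],
})

# Option 1: take an explicit copy at the point of slicing, then fill the copy.
# `data` itself is left unchanged by this.
df = data.iloc[:, 0:-1].copy()
df = df.fillna(df.median())

# Option 2: fill the slice and write the result back into the original frame.
features = data.iloc[:, 0:-1]
data.iloc[:, 0:-1] = features.fillna(features.median())
```

Option 1 fits when the slice is meant to be an independent frame; Option 2 fits when the goal is to update `data` itself.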

1 Answer

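Try `df = df.fillna(df.median(axis=0, skipna=True))`. Use `axis=0` to get one median per column; `axis=1` gives one median per row, which `fillna` cannot match against the column names. `fillna` returns a new DataFrame, so assign the result.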
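
The axis passed to `median` matters for this call. The following sketch uses made-up toy data (the frame and the variable names are assumptions for illustration only) to show how `fillna` treats a Series of column medians versus a Series of row medians.

```python
import numpy as np
import pandas as pd

# Toy frame, assumed for illustration only.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 4.0, 6.0]})

col_medians = df.median(axis=0, skipna=True)  # indexed by column name: "a", "b"
row_medians = df.median(axis=1, skipna=True)  # indexed by row label: 0, 1, 2

# fillna looks up each column name in the Series index, so column medians line up.
filled_by_column = df.fillna(col_medians)

# Row medians are keyed by row labels, which are not column names in this frame,
# so no column finds a fill value and the NaNs stay in place.
filled_by_row_attempt = df.fillna(row_medians)
```

If row-wise filling is really what is wanted, it needs a different approach than passing the row medians straight to `fillna`.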

xtian