I'm using the famous Titanic dataset for my first Kaggle problem, and I'm stuck on cleaning the data. I want to replace the NaN values in Age by gender, e.g. missing values for 'male' should be replaced by the average age of males, and vice versa for 'female'. My code runs, but I get the following exception: "SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy self._update_inplace(new_data)"

import numpy as np
import pandas as pd

df = pd.read_csv('train.csv')
df[(df['Sex'] == 'male') & (df['Age'].apply(np.isnan))]['Age'].fillna(df[df['Sex'] == 'male']['Age'].mean(), inplace=True)
Deepanshu
  • It's not an exception, it's just a warning. There's plenty of info if you Google that warning text. In plenty of cases it makes no difference in getting the expected result. – roganjosh Jan 27 '18 at 14:41
  • Possible duplicate of [How to deal with SettingWithCopyWarning in Pandas?](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas) – roganjosh Jan 27 '18 at 14:45
  • Oops, I just checked it, thanks. But the NaN values in Age are still not getting replaced. – Deepanshu Jan 27 '18 at 14:50
  • In any case, the answer you've been given is much more elegant than your existing code, dupe or not :) – roganjosh Jan 27 '18 at 14:51

1 Answer

import pandas as pd

df = pd.read_csv('train.csv')
# Fill each missing Age with the mean Age of that row's Sex group and assign the result back
df['Age'] = df['Age'].fillna(df.groupby('Sex')['Age'].transform('mean'))

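Maybe this is what you were trying to do? I didn't get any warning with it. Your original line filters df with a boolean mask first, which returns a copy, so fillna(..., inplace=True) fills that temporary copy and df itself is left unchanged. Have a look at my blog post too if necessary.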
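If it helps to see the effect without train.csv, here is a minimal sketch on a made-up frame. The toy values, `group_mean`, `df_male_only` and `male_missing` are my own illustrative names and data, not anything from the Titanic file. It shows the group-wise fill, plus a `.loc` variant in case you only want to touch the 'male' rows as in the question.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the Titanic data (values are made up for illustration).
df = pd.DataFrame({
    "Sex": ["male", "male", "male", "female", "female", "female"],
    "Age": [20.0, 40.0, np.nan, 30.0, np.nan, 50.0],
})

# Keep an untouched copy for the male-only variant further down.
df_male_only = df.copy()

# For every row, the mean Age of that row's Sex group (NaN ages are skipped).
group_mean = df.groupby("Sex")["Age"].transform("mean")

# Assign the filled column back instead of calling inplace=True on a selection.
df["Age"] = df["Age"].fillna(group_mean)
print(df)

# Variant: fill only the missing ages of males, using a single .loc assignment
# so the write goes to the original frame and not to a temporary copy.
male_missing = (df_male_only["Sex"] == "male") & df_male_only["Age"].isna()
male_mean = df_male_only.loc[df_male_only["Sex"] == "male", "Age"].mean()
df_male_only.loc[male_missing, "Age"] = male_mean
print(df_male_only)
```

The first version handles every group in one pass. The `.loc` version is closer to the original attempt and leaves the missing 'female' age as NaN.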

higee