I Know that the fillna()
method can be used to fill NaN in whole dataframe.
df.fillna(df.mean()) # fill with mean of column.
How to limit mean calculation to the group (and the column) where the NaN is.
Exemple:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'a': pd.Series([1,1,1,2,2,2]),
'b': pd.Series([1,2,np.NaN,1,np.NaN,4])
})
print df
Input
a b
0 1 1
1 1 2
2 1 NaN
3 2 1
4 2 NaN
5 2 4
Output (after groupby('a') & replace NaN by mean of group)
a b
0 1 1.0
1 1 2.0
2 1 1.5
3 2 1.0
4 2 2.5
5 2 4.0