I have a DataFrame like the one below. I need to replace each NaN with the median value for that row's animal type. For example, I need to calculate the median for cats and then fill only the cats' NaN values with it. Is there a way to do this in one command, or do I need to do it manually for each type?

  animal  age  weight priority
a    cat  2.5       1      yes
b    cat  1.0       3      yes
c    dog  0.5       6       no
d    dog  NaN       8      yes
e    cat  5.0       4       no
f    cat  2.0       3       no
g    dog  3.5      10       no
h    cat  NaN       2      yes
i    dog  7.0       7       no
j    dog  3.0       3       no
Daniel Kelvich

1 Answer

Use GroupBy.transform to compute the median of each group as a Series with the same size as the original DataFrame, so it can be passed to fillna to replace the NaNs:

# Fill each missing age with the median age of that row's animal group
df['age'] = df['age'].fillna(df.groupby('animal')['age'].transform('median'))
print(df)
  animal   age  weight priority
a    cat  2.50       1      yes
b    cat  1.00       3      yes
c    dog  0.50       6       no
d    dog  3.25       8      yes
e    cat  5.00       4       no
f    cat  2.00       3       no
g    dog  3.50      10       no
h    cat  2.25       2      yes
i    dog  7.00       7       no
j    dog  3.00       3       no
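
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the DataFrame is built by hand from the table in the question; the variable name `group_median` is only illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (typed in by hand).
df = pd.DataFrame(
    {
        "animal": ["cat", "cat", "dog", "dog", "cat", "cat", "dog", "cat", "dog", "dog"],
        "age": [2.5, 1.0, 0.5, np.nan, 5.0, 2.0, 3.5, np.nan, 7.0, 3.0],
        "weight": [1, 3, 6, 8, 4, 3, 10, 2, 7, 3],
        "priority": ["yes", "yes", "no", "yes", "no", "no", "no", "yes", "no", "no"],
    },
    index=list("abcdefghij"),
)

# Median age per animal, repeated for every row of that animal.
# NaN values are skipped when the median is computed.
group_median = df.groupby("animal")["age"].transform("median")

# fillna aligns on the index, so only the missing ages are replaced.
df["age"] = df["age"].fillna(group_median)

# Show the two rows that had missing ages.
print(df.loc[["d", "h"], ["animal", "age"]])
```

Rows that already had an age are left unchanged, because fillna only touches the NaN positions.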

Detail:

print(df.groupby('animal')['age'].transform('median'))
a    2.25
b    2.25
c    3.25
d    3.25
e    2.25
f    2.25
g    3.25
h    2.25
i    3.25
j    3.25
Name: age, dtype: float64
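
To see why transform is used here rather than a plain aggregation, the sketch below compares the two. It assumes a cut-down copy of the question's data with only the animal and age columns; the names `ages`, `per_group`, `per_row` and `mapped` are illustrative. Mapping the per-group medians back onto the animal column is shown only as an equivalent alternative, not as the method from the answer.

```python
import numpy as np
import pandas as pd

# Only the two columns needed for the comparison, taken from the question.
ages = pd.DataFrame(
    {
        "animal": ["cat", "cat", "dog", "dog", "cat", "cat", "dog", "cat", "dog", "dog"],
        "age": [2.5, 1.0, 0.5, np.nan, 5.0, 2.0, 3.5, np.nan, 7.0, 3.0],
    },
    index=list("abcdefghij"),
)

# Aggregation: one value per group, indexed by animal.
per_group = ages.groupby("animal")["age"].median()

# Transform: one value per original row, indexed like ages.
per_row = ages.groupby("animal")["age"].transform("median")

# Equivalent alternative: look up each row's animal in the per-group result.
mapped = ages["animal"].map(per_group)

print(per_group)
print(len(per_row), len(mapped))
```

Because `per_row` shares the index of the original DataFrame, it can be passed to fillna directly, while `per_group` would first have to be mapped back onto the rows.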
jezrael