I'm having a hard time understanding why the apply function isn't working here. I'm trying to fill the null values in SalePrice with the mean sale price of the corresponding quality rating (OverallQual).

I expected the function to iterate through each row and, where SalePrice is null, return the mean SalePrice for that row's OverallQual value; otherwise it should return the original SalePrice.

sale_price_by_qual = df.groupby('OverallQual')['SalePrice'].mean()

def fill_sales_price(SalePrice, OverallQual):
   if np.isnan(SalePrice):
      return sale_price_by_qual[SalePrice]
   else:
      return SalePrice

df['SalePrice'] = df.apply(lambda x: fill_sales_price(x['SalePrice'], x['OverallQual']), axis=1)

KeyError: nan

2 Answers

1

Could you save the mean value in a variable and then call .fillna()?

x = your mean value

df['SalePrice'] = df['SalePrice'].fillna(x)
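
A minimal sketch of that idea, assuming you want the per-quality mean rather than one overall mean: fillna also accepts a Series aligned on the index, so x can hold each row's group mean. The toy DataFrame and its values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical values) standing in for the real df.
df = pd.DataFrame({
    "OverallQual": [5, 5, 7, 7, 7],
    "SalePrice": [100.0, np.nan, 200.0, 300.0, np.nan],
})

# Mean SalePrice per quality rating; mean() skips NaN by default.
sale_price_by_qual = df.groupby("OverallQual")["SalePrice"].mean()

# Look up each row's group mean by its OverallQual value.
x = df["OverallQual"].map(sale_price_by_qual)

# Only the missing prices are replaced; existing ones are kept.
df["SalePrice"] = df["SalePrice"].fillna(x)
```

This avoids a row-by-row apply, which tends to be slower on large frames.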
codingrainha
0

Try this,

def fill_sales_price(SalePrice, OverallQual):
    if np.isnan(SalePrice):
        # Look up the group mean by quality rating, not by the missing price
        return sale_price_by_qual[OverallQual]
    else:
        return SalePrice

df['SalePrice'] = df.apply(lambda x: fill_sales_price(x['SalePrice'], x['OverallQual']), axis=1)
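
The original raised KeyError: nan because it indexed sale_price_by_qual with the missing SalePrice itself; the lookup key has to be OverallQual. Here is a quick check of the corrected function on made-up toy data (the values are hypothetical, not from your dataset):

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical values) with two missing prices.
df = pd.DataFrame({
    "OverallQual": [5, 5, 7, 7, 7],
    "SalePrice": [100.0, np.nan, 200.0, 300.0, np.nan],
})

# Series indexed by OverallQual, holding the mean SalePrice of each group.
sale_price_by_qual = df.groupby("OverallQual")["SalePrice"].mean()

def fill_sales_price(SalePrice, OverallQual):
    if np.isnan(SalePrice):
        # The key is the quality rating, which is never NaN in this toy data.
        return sale_price_by_qual[OverallQual]
    return SalePrice

df["SalePrice"] = df.apply(
    lambda x: fill_sales_price(x["SalePrice"], x["OverallQual"]), axis=1
)
print(df)
```

The rows with OverallQual 5 and 7 that had no price should come back with their group means (100.0 and 250.0 here), and the other rows are unchanged.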