I'm trying to clean a dataset and noticed that a few features are reported as non-null float type, yet their values contain NaN.

I tried the code below:

cleaned_customer_data.fillna(cleaned_customer_data.mean()).head()

This returns 0 records.

I also tried:

cleaned_customer_data.fillna(cleaned_customer_data.mean())

It doesn't replace the NaN values with the mean.

Data sample:

FEATURE1
--------
NaN
2.0
NaN
NaN
NaN
1.294

Am I doing something wrong here? Please guide.

Khilesh Chauhan

2 Answers

First, calculate the mean for the required column:

mean_value=cleaned_customer_data['FEATURE1'].mean()

Then, fill the NaN values with the obtained mean:

cleaned_customer_data['FEATURE1'] = cleaned_customer_data['FEATURE1'].fillna(mean_value)

Display your df:

cleaned_customer_data
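As a minimal sketch, the steps above can be run end to end on the sample values from the question. The DataFrame is rebuilt here for illustration only; the real `cleaned_customer_data` may have more columns and rows.

```python
import numpy as np
import pandas as pd

# Sample values copied from the question; the frame itself is rebuilt for illustration.
cleaned_customer_data = pd.DataFrame(
    {"FEATURE1": [np.nan, 2.0, np.nan, np.nan, np.nan, 1.294]}
)

# Series.mean() skips NaN by default, so this averages 2.0 and 1.294.
mean_value = cleaned_customer_data["FEATURE1"].mean()

# fillna returns a new Series, so assign the result back to the column.
cleaned_customer_data["FEATURE1"] = cleaned_customer_data["FEATURE1"].fillna(mean_value)

print(cleaned_customer_data)
```

Assigning the result back to the column avoids relying on `inplace=True` on a selected column, which newer pandas versions warn about.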

Anna
  • This is the same as `cleaned_customer_data = cleaned_customer_data.fillna(cleaned_customer_data.mean())`, which is not working for the OP. https://stackoverflow.com/questions/72470709/how-to-remove-nan-values-from-dataframe#comment128022576_72470709 – jezrael Jun 02 '22 at 05:05
  • Thanks Nina, found the issue. The column values are all NaN. This occurred after data pre-cleaning, which dropped the rows that had values in this feature based on some condition. – Khilesh Chauhan Jun 02 '22 at 05:28
  • Please include an explanation with your answer to help readers understand how this works, and solves the problem. You can click the edit button at the bottom of your answer to add an explanation. Additionally, have a read of [how to write a good answer](https://stackoverflow.com/help/how-to-answer) – Freddy Mcloughlan Jun 02 '22 at 23:19

First, calculate the mean of the non-NaN values:

mean_df = df.loc[df['FEATURE1'].notna(), 'FEATURE1'].mean()

Then assign the mean wherever there is a NaN:

df.loc[df['FEATURE1'].isna(),'FEATURE1'] = mean_df
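If the assignment seems to have no effect, it may be worth checking whether the column has any non-NaN values at all, which is what the asker reported in the comments on the other answer. The sketch below uses a hypothetical two-column frame (`FEATURE2` is made up for contrast) to show that the mean of an all-NaN column is itself NaN, so filling with it changes nothing.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: FEATURE1 is entirely NaN, FEATURE2 has a single gap.
df = pd.DataFrame({
    "FEATURE1": [np.nan, np.nan, np.nan],
    "FEATURE2": [1.0, np.nan, 3.0],
})

# Columns where every value is NaN have no mean to fill with.
all_nan_columns = df.columns[df.isna().all()].tolist()
print("All-NaN columns:", all_nan_columns)

# DataFrame.mean() yields NaN for such a column, so fillna leaves it unchanged,
# while FEATURE2 gets its gap filled with the mean of 1.0 and 3.0.
filled = df.fillna(df.mean())
print(filled)
```

For a column like this, a different fill value (or dropping the column) has to be chosen, since there is no data to compute a mean from.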