How to remove NaN values from dataframe

Question

I'm trying to clean a dataset and observed that a few features are of type non-null float, but their values contain NaN.

I tried the code below:

cleaned_customer_data.fillna(cleaned_customer_data.mean()).head()

This returns 0 records.

I also tried:

cleaned_customer_data.fillna(cleaned_customer_data.mean())

It doesn't replace the NaN values with the mean.

Data sample:

FEATURE1
--------
NaN
2.0
NaN
NaN
NaN
1.294

Am I doing something wrong here? Please guide me.

Can you add a data sample? Is it possible that the missing values are strings? What does `print (cleaned_customer_data[cleaned_customer_data.isna().any(axis=1)])` show? – jezrael, Jun 02 '22 at 04:35
So `print (cleaned_customer_data[cleaned_customer_data.isna().any(axis=1)])` returns rows with missing values? – jezrael, Jun 02 '22 at 04:39
Hmm, do you assign the result back? `cleaned_customer_data = cleaned_customer_data.fillna(cleaned_customer_data.mean())` – jezrael, Jun 02 '22 at 04:40
Yes, I tried `cleaned_customer_data = cleaned_customer_data.fillna(cleaned_customer_data.mean())` followed by `cleaned_customer_data.head()`, and the result is still the same. – Khilesh Chauhan, Jun 02 '22 at 04:45
So `print (df.dtypes)` shows floats for all the columns where the NaNs are not removed? – jezrael, Jun 02 '22 at 04:59
Found the issue. The column values are all NaN. This occurred after data pre-cleaning, which dropped the rows that had values in this feature based on some condition. Thanks – Khilesh Chauhan, Jun 02 '22 at 05:27
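
To illustrate the cause the OP found, here is a minimal sketch. The frame and the `FEATURE2` column are made up for illustration. It shows that the mean of an all-NaN column is itself NaN, so `fillna` has nothing to fill with, and that `isna().all()` can flag such columns up front.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: FEATURE1 is entirely NaN, FEATURE2 is only partly missing.
df = pd.DataFrame({
    "FEATURE1": [np.nan, np.nan, np.nan],
    "FEATURE2": [1.0, np.nan, 3.0],
})

col_means = df.mean()          # the mean of an all-NaN column is itself NaN
filled = df.fillna(col_means)  # FEATURE2 gets its mean, FEATURE1 stays NaN

# Columns in which every value is missing
all_nan_cols = df.columns[df.isna().all()].tolist()
print(all_nan_cols)
print(filled)
```

Running a check like this before imputing makes it clear whether a column has any values left to compute a mean from.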

Anna · Answer 1 · 2022-06-03T13:05:08.197

0

First, calculate the mean for the required column:

mean_value = cleaned_customer_data['FEATURE1'].mean()

Then, fill the NaN values with the obtained mean:

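# Assign the result back; inplace=True on a selected column may not update the frame in recent pandas versions
cleaned_customer_data['FEATURE1'] = cleaned_customer_data['FEATURE1'].fillna(value=mean_value)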

Display your df:

cleaned_customer_data
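
As a self-contained sketch of these steps, the following uses the sample values from the question. It assumes the column still holds some non-NaN values, and it assigns the filled column back to the frame.

```python
import numpy as np
import pandas as pd

# Sample values copied from the question.
cleaned_customer_data = pd.DataFrame(
    {"FEATURE1": [np.nan, 2.0, np.nan, np.nan, np.nan, 1.294]}
)

# mean() skips NaN by default, so this averages 2.0 and 1.294
mean_value = cleaned_customer_data["FEATURE1"].mean()

# Assign the filled column back to the frame
cleaned_customer_data["FEATURE1"] = cleaned_customer_data["FEATURE1"].fillna(mean_value)
print(cleaned_customer_data)
```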

edited Jun 03 '22 at 13:05

answered Jun 02 '22 at 05:03


This is the same as `cleaned_customer_data = cleaned_customer_data.fillna(cleaned_customer_data.mean())`, which is not working for the OP. https://stackoverflow.com/questions/72470709/how-to-remove-nan-values-from-dataframe#comment128022576_72470709 – jezrael Jun 02 '22 at 05:05
Thanks Nina, Found the issue. The column values are all NaN. This occurred after data pre-cleaning, which dropped the rows that had values in this feature based on some condition. – Khilesh Chauhan Jun 02 '22 at 05:28
Please include an explanation with your answer to help readers understand how this works, and solves the problem. You can click the edit button at the bottom of your answer to add an explanation. Additionally, have a read of [how to write a good answer](https://stackoverflow.com/help/how-to-answer) – Freddy Mcloughlan Jun 02 '22 at 23:19

score 0 · Answer 2 · answered Jun 02 '22 at 08:35


First, calculate the mean:

mean_df = df['FEATURE1'].mean()  # mean() already skips NaN values by default

Then assign the mean wherever the value is NaN:

df.loc[df['FEATURE1'].isna(),'FEATURE1'] = mean_df
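
A possible way to wrap this mask-based assignment so that it also covers the all-NaN case from the question is sketched below. The helper name `fill_with_mean`, the `fallback` parameter, its default of `0.0`, and the `FEATURE2` column are all hypothetical and chosen only for illustration. Whether a constant fallback is appropriate depends on the data.

```python
import numpy as np
import pandas as pd

def fill_with_mean(df, column, fallback=0.0):
    """Hypothetical helper: fill NaN in `column` with the column mean,
    or with `fallback` when the column has no non-NaN values at all."""
    mean_value = df[column].mean()
    if pd.isna(mean_value):
        # The mean is NaN when every value is missing, so use the fallback
        mean_value = fallback
    # Boolean mask selects the NaN rows; assign the fill value in place
    df.loc[df[column].isna(), column] = mean_value
    return df

df = pd.DataFrame({
    "FEATURE1": [np.nan, 2.0, np.nan, 1.294],
    "FEATURE2": [np.nan, np.nan, np.nan, np.nan],  # assumed all-NaN column
})
fill_with_mean(df, "FEATURE1")
fill_with_mean(df, "FEATURE2")
print(df)
```

Dropping an all-NaN column with `df.dropna(axis=1, how="all")` is an alternative when a constant fill would be misleading.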

