Data Preprocessing: Missing values imputer value error

Question

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

#importing
dataset = pd.read_csv("Data.csv")
x = dataset.iloc[:, 0:3].values
y = dataset.iloc[: , 3].values

#missingdata
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(missing_values='0',strategy='mean')
imputer = imputer.fit(x[:, 1:3])
x[:, 1:3] =  imputer.transform(x[:, 1:3])

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

Please post the full traceback. – ewokx, Aug 18 '20 at 06:17

score 1 · Answer 1 · answered Aug 18 '20 at 10:41

If you have NaN values in your dataset, you can replace them with zero, as in the example below. Set inplace=True if you want to replace the values in the existing DataFrame rather than return a new one with the replacements.

# fillna is a pandas method, so call it on the DataFrame before extracting .values
dataset.fillna(0, inplace=True)

The official pandas documentation for DataFrame.fillna has more information.

score 0 · Answer 2 · answered Aug 18 '20 at 06:01

You need to check the input data values. Most likely, the data contains NaN values that need to be handled. You can confirm this by inspecting the unique values in each column. You can also drop the rows that contain NaN from the DataFrame; see "How to drop rows of Pandas DataFrame whose value in a certain column is NaN".
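As a rough sketch of that check, the snippet below counts the NaN entries per column and then drops the affected rows. The DataFrame is a made-up stand-in for Data.csv, whose contents are not shown in the question, so the column names and values are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Data.csv; column names and values are assumptions.
dataset = pd.DataFrame({
    "Country": ["France", "Spain", "Germany", "Spain"],
    "Age": [44.0, 27.0, np.nan, 38.0],
    "Salary": [72000.0, np.nan, 54000.0, 61000.0],
    "Purchased": ["No", "Yes", "No", "No"],
})

# Count the missing entries in each column.
nan_counts = dataset.isna().sum()
print(nan_counts)

# Drop every row that has NaN in the chosen columns.
cleaned = dataset.dropna(subset=["Age", "Salary"])
print(cleaned)
```

Dropping rows discards data, so with a small dataset imputing the missing values may be the better choice.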

score 0 · Answer 3 · edited Jun 06 '21 at 11:59

import numpy as np
from sklearn.impute import SimpleImputer

imputer = SimpleImputer(missing_values=np.nan, strategy='mean')  # treat NaN as the missing-value marker
imputer = imputer.fit(x[:, 1:3])  # learn the column means from the numeric columns of x
x[:, 1:3] = imputer.transform(x[:, 1:3])  # replace the NaN entries with those means

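The key change is missing_values=np.nan. The question passes missing_values='0', which tells the imputer to look for the string '0' instead of NaN.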
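To see the effect of missing_values=np.nan in isolation, here is a minimal self-contained sketch. The array is a made-up stand-in for dataset.iloc[:, 0:3].values, since the contents of Data.csv are not shown in the question, and the explicit cast to float is an added precaution rather than part of the original answer.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical stand-in for dataset.iloc[:, 0:3].values (values are assumptions).
x = np.array([
    ["France", 44.0, 72000.0],
    ["Spain", 27.0, np.nan],
    ["Germany", np.nan, 54000.0],
    ["Spain", 38.0, 61000.0],
], dtype=object)

# NaN is the marker for missing entries; each one is replaced by its column mean.
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
x[:, 1:3] = imputer.fit_transform(x[:, 1:3].astype(float))
print(x)
```

Here fit_transform combines the separate fit and transform calls into one step. The missing age becomes the mean of the three known ages, and the missing salary becomes the mean of the three known salaries.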

answered Jun 06 '21 at 01:13 by Rizwan Bin Sulaiman · edited Jun 06 '21 at 11:59 by Josef
