-1
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

#importing
dataset = pd.read_csv("Data.csv")
x = dataset.iloc[:, 0:3].values
y = dataset.iloc[: , 3].values

#missingdata
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(missing_values='0',strategy='mean')
imputer = imputer.fit(x[:, 1:3])
x[:, 1:3] =  imputer.transform(x[:, 1:3])

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

desertnaut

3 Answers

1

If your dataset contains NaN values, you can replace them with zero, as in the example below. Pass inplace=True if you want to modify the existing DataFrame rather than get back a new one with the replacements. Note that fillna is a pandas method, so x here is assumed to be a DataFrame, not the NumPy array produced by .values in the question.

x.fillna(0, inplace=True)

See the official pandas documentation for DataFrame.fillna for more information.
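
As a minimal sketch of the two calling styles, assuming a small hypothetical DataFrame (the column names and values are made up for illustration and are not the asker's Data.csv):

```python
import numpy as np
import pandas as pd

# Hypothetical data, not the asker's Data.csv
df = pd.DataFrame({
    "Age": [44.0, np.nan, 30.0],
    "Salary": [72000.0, 48000.0, np.nan],
})

# Without inplace=True, fillna returns a new DataFrame and leaves df unchanged
filled = df.fillna(0)
nan_left_in_df = int(df.isna().sum().sum())

# With inplace=True, df itself is modified
df.fillna(0, inplace=True)
nan_after_inplace = int(df.isna().sum().sum())
```

Keep in mind that filling with zero changes column statistics such as the mean, so whether it is appropriate depends on the data.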

Catalina Chircu
0

You need to check the input data values. Most likely they contain NaN values, which need to be handled before fitting. You can find them by inspecting the unique values in each column. You can also remove the NaN rows from the DataFrame; refer to "How to drop rows of Pandas DataFrame whose value in a certain column is NaN".
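
The checks described above could look roughly like this. It is only a sketch: the DataFrame is a hypothetical stand-in for Data.csv, and the column names are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for Data.csv
df = pd.DataFrame({
    "Country": ["France", "Spain", "Germany"],
    "Age": [44.0, np.nan, 30.0],
    "Salary": [72000.0, 48000.0, np.nan],
})

# Count NaN entries per column to see where the problem is
nan_counts = df.isna().sum()
print(nan_counts)

# Inspect the distinct values of each column; NaN and odd placeholders show up here
for col in df.columns:
    print(col, df[col].unique())

# Drop rows that have NaN in one specific column, or in any column
no_nan_age = df.dropna(subset=["Age"])
no_nan_any = df.dropna()
```

Dropping rows discards data, so on a small dataset imputing the missing values may be the better choice.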

awadhesh pathak
0
import numpy as np
from sklearn.impute import SimpleImputer

imputer = SimpleImputer(missing_values=np.nan, strategy='mean')  # treat NaN as the missing-value marker
imputer = imputer.fit(x[:, 1:3])  # fit the imputer on columns 1 and 2 of x
x[:, 1:3] = imputer.transform(x[:, 1:3])  # replace the NaN entries with the column means

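The key change is missing_values = np.nan. The question passes the string '0' instead, so the NaN entries in the data are not treated as missing.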
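
A small self-contained sketch of the same idea, using a hypothetical numeric array in place of the asker's columns (the values are made up):

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical numeric columns with missing entries marked as NaN
x_num = np.array([
    [44.0, 72000.0],
    [np.nan, 48000.0],
    [30.0, np.nan],
])

# Tell the imputer that NaN marks a missing value and fill with the column mean
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
x_filled = imputer.fit_transform(x_num)
print(x_filled)
```

Each NaN is replaced by the mean of the remaining values in its column, so the array no longer contains NaN when it is passed on to an estimator.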

Josef