ValueError: could not convert string to float when imputing missing data

Question

I am working on the Melbourne housing dataset, and during preprocessing I'm trying to impute missing data using the mean/median strategy. I tried using Imputer from sklearn.preprocessing.

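from sklearn.preprocessing import Imputer

imp = Imputer(strategy='mean')
dataset = imp.fit_transform(dataset)  # fit() alone returns the imputer, not the imputed data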

Upon running this I encountered this error.

ValueError: could not convert string to float: 'Western Metropolitan'

I am aware that mean/median imputation only works on float values, but I need to do one of the following:

1) Impute only values other than string in the dataset

2) Impute data with string

I could not find any kind of solution online. Thanks in advance.

You shouldn't even try to impute strings. Use only the columns without strings. Or, better yet, depending on the model you're working with, drop the rows with empty values (assuming very high accuracy isn't the target); a few unclear values wouldn't make much difference. You could also train a separate model to impute (i.e. predict) those fields. — Souyama, Mar 17 '19 at 14:25
The link below explains your issue: https://stackoverflow.com/questions/25239958/impute-categorical-missing-values-in-scikit-learn — Giri, Mar 17 '19 at 14:32
I referred to that question thread, and it helped me fix the problem. Thanks a lot! @Giri — Umang Mistry, Mar 18 '19 at 08:28
Does this answer your question? [Impute categorical missing values in scikit-learn](https://stackoverflow.com/questions/25239958/impute-categorical-missing-values-in-scikit-learn) — zhrist, Mar 31 '23 at 07:59

score 0 · Answer 1 · answered Mar 17 '19 at 14:24

scikit-learn's imputer cannot apply the mean or median strategy to categorical (string) columns. You need to dummy-encode all your categorical variables before imputing the missing values. Even a single categorical column triggers the error.
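As a minimal sketch of this idea (not the asker's actual data): the toy frame and its column names below are hypothetical, and SimpleImputer from sklearn.impute is assumed in place of the older Imputer class. The numeric columns are imputed with the mean, and the string column is dummy-encoded with pd.get_dummies, where dummy_na=True keeps missingness as its own indicator column.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical toy frame standing in for the housing data
df = pd.DataFrame({
    "Regionname": ["Western Metropolitan", np.nan, "Northern Metropolitan"],
    "Price": [1000000.0, np.nan, 800000.0],
    "Rooms": [3.0, 2.0, np.nan],
})

# Mean-impute only the numeric columns; string columns are left untouched
num_cols = df.select_dtypes(include="number").columns
df[num_cols] = SimpleImputer(strategy="mean").fit_transform(df[num_cols])

# Dummy-encode the string column; dummy_na=True adds an indicator for missing values
df = pd.get_dummies(df, columns=["Regionname"], dummy_na=True)
```

After this, the frame contains only numeric or indicator columns and no missing values, so it can be passed to an estimator.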

sandeep patil

score 0 · Answer 2 · answered Mar 27 '23 at 13:07

Use strategy="most_frequent" or strategy="constant" (together with fill_value) in SimpleImputer; both strategies work on string columns.
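A small sketch of both strategies, assuming SimpleImputer from sklearn.impute. The column names, values and the "Unknown" fill value are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical string-only frame with missing entries
df = pd.DataFrame({
    "Regionname": ["Western Metropolitan", np.nan,
                   "Western Metropolitan", "Northern Metropolitan"],
    "Type": ["h", "u", np.nan, "h"],
})

# Fill each column with its most frequent value
mode_imp = SimpleImputer(strategy="most_frequent")
filled_mode = pd.DataFrame(mode_imp.fit_transform(df), columns=df.columns)

# Fill every missing entry with a fixed placeholder
const_imp = SimpleImputer(strategy="constant", fill_value="Unknown")
filled_const = pd.DataFrame(const_imp.fit_transform(df), columns=df.columns)
```

The "most_frequent" strategy keeps the set of categories unchanged, while "constant" makes the missing values visible as a separate category; which is preferable depends on the model.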

Navneet Tiwari
