Error in prediction in svm classifier after one hot encoding

Question

I applied one-hot encoding to my dataset before training my SVM classifier, which increased the number of features in the training set to 982. But when I predict on the test dataset, which has 7 features, I get the error "X has 7 features per sample; expecting 982". I don't understand how to increase the number of features in the test dataset.

My code is:
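import numpy as np
import pandas as pd
from sklearn import svm
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

df = pd.read_csv('train.csv', header=None)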
features = df.iloc[:,:-1].values
labels = df.iloc[:,-1].values
encode = LabelEncoder()
features[:,2] = encode.fit_transform(features[:,2])
features[:,3] = encode.fit_transform(features[:,3])
features[:,4] = encode.fit_transform(features[:,4])
features[:,5] = encode.fit_transform(features[:,5])

df1 = pd.DataFrame(features)
#--------------------------- ONE HOT ENCODING --------------------------------#

hotencode = OneHotEncoder(categorical_features=[2])
features = hotencode.fit_transform(features).toarray()
hotencode = OneHotEncoder(categorical_features=[14])
features = hotencode.fit_transform(features).toarray()
hotencode = OneHotEncoder(categorical_features=[37])
features = hotencode.fit_transform(features).toarray()
hotencode = OneHotEncoder(categorical_features=[466])
features = hotencode.fit_transform(features).toarray()
X = np.array(features)
y = np.array(labels)

clf = svm.LinearSVC()
clf.fit(X,y)
d_test = pd.read_csv('query.csv')
Z_test =np.array(d_test)
confidence = clf.predict(Z_test)
print("The query image belongs to Class ")
print(confidence)

Test dataset (query.csv):
1   0.076   1   3232236298  2886732679  3128    60604

score 0 · Accepted Answer · answered May 22 '18 at 09:10

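The short answer: you need to apply the same one-hot encoding (OHE) transform, or label encoding plus OHE in your case, to the test set. That means reusing the encoders already fitted on the training data and calling transform on the test rows, not fit_transform, so the test set ends up with the same 982 columns the classifier was trained on.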

For good advice, see "Scikit Learn OneHotEncoder fit and transform Error: ValueError: X has different shape than during fitting" or "How to deal with imputation and hot one encoding in pandas?"
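
As a rough illustration of "fit on train, transform on test", here is a minimal sketch. It does not use the question's `categorical_features` argument but swaps in `ColumnTransformer` and `Pipeline` from current scikit-learn releases. The tiny in-memory training table, its labels "a" and "b", and the choice of columns 2 to 5 as categorical are assumptions made for illustration; only the query row comes from the question.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.svm import LinearSVC

# Hypothetical stand-in for train.csv: 7 feature columns plus a label column.
train = pd.DataFrame([
    [1, 0.076, 1, 3232236298, 2886732679, 3128, 60604, "a"],
    [2, 0.120, 6, 3232236299, 2886732680, 443, 51000, "b"],
    [1, 0.300, 1, 3232236298, 2886732681, 80, 60604, "a"],
    [3, 0.050, 17, 3232236300, 2886732679, 53, 40000, "b"],
])
X_train = train.iloc[:, :-1]
y_train = train.iloc[:, -1]

# Assumption: columns 2..5 are the categorical ones, as in the question.
categorical_columns = [2, 3, 4, 5]

# One encoder handles all categorical columns; the other columns pass through.
# handle_unknown="ignore" encodes categories unseen during fit as all zeros
# instead of raising an error at prediction time.
encoder = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), categorical_columns)],
    remainder="passthrough",
)

# The pipeline fits the encoder on the training data only and reuses the
# fitted encoder inside predict(), so train and test widths always match.
model = make_pipeline(encoder, LinearSVC())
model.fit(X_train, y_train)

# The raw 7-column query row from the question, with no manual encoding.
query = pd.DataFrame([[1, 0.076, 1, 3232236298, 2886732679, 3128, 60604]])
prediction = model.predict(query)
print("The query belongs to class", prediction[0])
```

Because the encoder lives inside the pipeline, the raw 7-column query is expanded to the training width automatically. If you keep separate encoders as in the question, the equivalent is to store each fitted encoder and call only `transform` on the test data.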
