I have a pandas DataFrame with the following structure (column name and dtype):
item_condition_id category
brand_name category
price float64
shipping category
main_category category
category category
sub_category category
hashing_feature_aa float64
hashing_feature_ab float64
Example with a portion of the data:
brand_name  shipping  main_category  category
Target      1         Women          Tops & Blouses
unknown     1         Home           Home Décor
unknown     0         Women          Jewelry
unknown     0         Women          Other
I converted the categorical (string) columns to numerical codes using the code below.
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
for i in range(len(X)):
    X.iloc[:, i] = le.fit_transform(X.iloc[:, i])
After conversion:
brand_name  shipping  main_category  category
0           1         1              3
1           1         0              0
1           0         1              1
1           0         1              2
This works as expected, but when I try to apply inverse_transform to recover the original categories from the numerical codes, it throws an error.
for i in range(len(X)):
    X.iloc[:, i] = le.inverse_transform(X.iloc[:, i])
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
How can I resolve this error? What is wrong with my code?
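For reference, here is a minimal sketch of one possible fix. It assumes the failure comes from reusing a single LabelEncoder for every column: an encoder only remembers the classes from its most recent fit, so it cannot reverse the codes of earlier columns. Note also that range(len(X)) counts rows, not columns. The toy frame is built from the sample rows above, and the names encoders, encoded and decoded are made up for illustration.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy frame built from the sample rows shown in the question.
X = pd.DataFrame({
    "brand_name": ["Target", "unknown", "unknown", "unknown"],
    "shipping": [1, 1, 0, 0],
    "main_category": ["Women", "Home", "Women", "Women"],
    "category": ["Tops & Blouses", "Home Décor", "Jewelry", "Other"],
})

# Keep one fitted encoder per column so each mapping can be reversed later.
# Iterating over X.columns avoids confusing the row count with the column count.
encoders = {}
encoded = pd.DataFrame(index=X.index)
for col in X.columns:
    encoders[col] = LabelEncoder()
    encoded[col] = encoders[col].fit_transform(X[col])

# Reverse each column with the encoder that was fitted on that same column.
decoded = pd.DataFrame(index=encoded.index)
for col in encoded.columns:
    decoded[col] = encoders[col].inverse_transform(encoded[col])
```

Writing into a separate frame (encoded) rather than back into X also keeps the original values available, which makes it easy to check the round trip.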
My goal is to convert the categorical (string) features to numerical values using LabelEncoder so that I can apply sklearn.feature_selection.SelectKBest.fit_transform(X, y); without label encoding, that step fails.
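To make that goal concrete, here is a rough sketch of the intended pipeline on toy data. It swaps in OrdinalEncoder instead of LabelEncoder, since OrdinalEncoder encodes several feature columns at once and can reverse them in one call. The target y, the chi2 score function and k=2 are assumptions chosen only for illustration, not part of my real setup.

```python
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.preprocessing import OrdinalEncoder

# Toy categorical features taken from the sample rows in the question.
X = pd.DataFrame({
    "brand_name": ["Target", "unknown", "unknown", "unknown"],
    "main_category": ["Women", "Home", "Women", "Women"],
    "category": ["Tops & Blouses", "Home Décor", "Jewelry", "Other"],
})
y = [0, 1, 0, 1]  # hypothetical class labels, invented for this sketch

# Encode every column at once; the encoder stores one category list per column.
encoder = OrdinalEncoder()
X_encoded = encoder.fit_transform(X)

# chi2 requires non-negative features, which ordinal codes satisfy.
selector = SelectKBest(score_func=chi2, k=2)
X_selected = selector.fit_transform(X_encoded, y)

# Recover the original strings from the encoded matrix.
X_restored = encoder.inverse_transform(X_encoded)
```

Treating ordinal codes as chi2 inputs is only a convenience here; whether that scoring function suits the real data is a separate question.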
Thanks