Looking at the documentation of the OneHotEncoder, there doesn't seem to be a way to use the input feature names as a prefix for the one-hot column names. Does anyone know of a way around this, or am I missing something?
Sample dataframe:
import pandas as pd
from sklearn.preprocessing import OneHotEncoder
df = pd.DataFrame({'a':['c1', 'c1', 'c2', 'c1', 'c3'], 'b':['c1', 'c4', 'c1', 'c1', 'c1']})
onehot = OneHotEncoder()
onehot.fit(df)
onehot.get_feature_names()
array(['x0_c1', 'x0_c2', 'x0_c3', 'x1_c1', 'x1_c4'], dtype=object)
Given that the encoder is fed a dataframe, I'd expect to be able to obtain something like:
array(['a_c1', 'a_c2', 'a_c3', 'b_c1', 'b_c4'], dtype=object)
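For reference, here is a minimal sketch of one workaround that may produce those names. It assumes the method accepts an optional list of input feature names, which is how I read the signature. It also assumes that newer scikit-learn releases (1.0 and later) expose this as get_feature_names_out, while older ones use get_feature_names, so the sketch checks which one is available. The variable `names` is just an illustrative name.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({'a': ['c1', 'c1', 'c2', 'c1', 'c3'],
                   'b': ['c1', 'c4', 'c1', 'c1', 'c1']})

onehot = OneHotEncoder()
onehot.fit(df)

# Assumption: scikit-learn >= 1.0 provides get_feature_names_out, and older
# releases provide get_feature_names; both take the input feature names
# as an optional argument used to build the output prefixes.
if hasattr(onehot, "get_feature_names_out"):
    names = onehot.get_feature_names_out(df.columns)
else:
    names = onehot.get_feature_names(df.columns)

print(list(names))
```

Passing df.columns explicitly keeps the prefixes tied to the dataframe's own column labels instead of the generic x0, x1 placeholders.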