Retrieve feature names from fit_transform?

Asked Nov 13 '20 at 09:42

Active Nov 14 '20 at 18:04

Viewed 143 times

I am using the following:

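from sklearn.compose import make_column_transformer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from xgboost import XGBClassifier

# Scale the numeric columns and one-hot encode 'country'
preprocess = make_column_transformer(
    (MinMaxScaler(), numeric_cols),
    (OneHotEncoder(handle_unknown='ignore'), ['country']),
)
pipeline = make_pipeline(preprocess, XGBClassifier())
param_grid = {
    'xgbclassifier__learning_rate': [0.01, 0.005, 0.001],
}

# The original call also passed iid=True; that argument was removed in scikit-learn 0.24
clf = GridSearchCV(pipeline, param_grid=param_grid, scoring='roc_auc',
                   verbose=1, refit=True, cv=3)
clf.fit(X_train, y_train)

# Pipeline refit on the full training data with the best parameters
model_best = clf.best_estimator_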

When I then do:

model_best.fit_transform(X_test)

I get the error:

ValueError: y should be a 1d array, got an array of shape () instead.

However, when I do the below:

pd.DataFrame(pipeline[0].fit_transform(X_test))

I do get a DataFrame, but how can I restore the feature names? Since the output contains the one-hot-encoded (OHE) variables, I am unsure how to do it; I cannot just use x_train.columns.tolist().

edited Nov 14 '20 at 18:04

StupidWolf


asked Nov 13 '20 at 09:42

mathella

I think you need to try something like this: https://stackoverflow.com/questions/57528350/can-you-consistently-keep-track-of-column-labels-using-sklearns-transformer-api – StupidWolf Nov 14 '20 at 18:21
Otherwise, it might be easier to use pd.get_dummies to transform your data frame first – StupidWolf Nov 14 '20 at 18:21
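A minimal sketch of the pd.get_dummies route mentioned in the comment, using hypothetical toy frames in place of X_train and X_test. Because get_dummies only sees the categories present in the frame it is given, the test frame is reindexed to the training columns here; that alignment step is an assumption about how one would keep the two frames consistent, not something stated in the comment.

```python
import pandas as pd

# Hypothetical toy frames standing in for X_train / X_test.
X_train = pd.DataFrame({"age": [20, 30, 40], "country": ["US", "DE", "FR"]})
X_test = pd.DataFrame({"age": [25, 45], "country": ["DE", "JP"]})

# Encode the training frame; column names are kept automatically.
train_enc = pd.get_dummies(X_train, columns=["country"])

# Encode the test frame, then align it to the training columns:
# categories unseen in training ("JP") are dropped, missing ones are filled with 0.
test_enc = pd.get_dummies(X_test, columns=["country"]).reindex(
    columns=train_enc.columns, fill_value=0
)

print(train_enc.columns.tolist())
print(test_enc)
```

The encoded frames can then be scaled and passed to the classifier directly, at the cost of handling the encoding outside the pipeline.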
