I am using the following:

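from sklearn.compose import make_column_transformer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from xgboost import XGBClassifier

# Scale the numeric columns and one-hot encode 'country'
preprocess = make_column_transformer(
    (MinMaxScaler(), numeric_cols),
    (OneHotEncoder(handle_unknown='ignore'), ['country']),
)
pipeline = make_pipeline(preprocess, XGBClassifier())
param_grid = {
    'xgbclassifier__learning_rate': [0.01, 0.005, 0.001],
}

# The original call also passed iid=True; that parameter was removed
# in scikit-learn 0.24, so it is omitted here.
clf = GridSearchCV(pipeline, param_grid=param_grid, scoring='roc_auc',
                   verbose=1, refit=True, cv=3)
clf.fit(X_train, y_train)

model_best = clf.best_estimator_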

When I then run:

model_best.fit_transform(X_test)

I get the error:

ValueError: y should be a 1d array, got an array of shape () instead.

However, when I run the following:

pd.DataFrame(pipeline[0].fit_transform(X_test)) 

I do get a DataFrame, but how can I restore the feature names? Because the output contains the one-hot-encoded variables, I am unsure how to do it; I cannot just use X_train.columns.tolist().

StupidWolf
mathella
  • I think you need to try something like this: https://stackoverflow.com/questions/57528350/can-you-consistently-keep-track-of-column-labels-using-sklearns-transformer-api – StupidWolf Nov 14 '20 at 18:21
  • Otherwise, it might be easier to use pd.get_dummies to transform your data frame first. – StupidWolf Nov 14 '20 at 18:21
