How to build a pipeline with GridSearchCV and early stopping?

Question

Here is my code. It starts with a pipeline (standardizing, replacing null values, one-hot encoding, and SelectKBest) followed by a LightGBM model that is fitted to my data.

from lightgbm import LGBMClassifier
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Numeric columns are standardized.
numeric_features = ['X10', 'X11', 'X12', 'X13', 'X14']
numeric_transformer = Pipeline(steps=[('scaler', StandardScaler())])

# Categorical columns: missing values become their own category, then one-hot encoding.
categorical_features = ['X1', 'X2', 'X3', 'X4', 'X5', 'X6', 'X7', 'X8', 'X9']
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='constant', fill_value='FLAG_NAN')),
    ('onehot', OneHotEncoder(handle_unknown='ignore')),
])

preprocessor = ColumnTransformer(transformers=[
    ('num', numeric_transformer, numeric_features),
    ('cat', categorical_transformer, categorical_features),
])

pipe = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('selector', SelectKBest(mutual_info_classif, k=5)),
    ('classifier', LGBMClassifier()),
])

search_space = dict(classifier=[LGBMClassifier()])

X_train = train.drop(columns=['Y'])
X_test = test.drop(columns=['Y'])
y_train = train['Y']
y_test = test['Y']

grid_search_pipe = GridSearchCV(estimator=pipe, param_grid=search_space, scoring="neg_mean_squared_error", cv=5)

grid_search_pipe.fit(X_train, y_train, classifier__early_stopping_rounds=10, classifier__eval_metric="rmse", classifier__eval_set=[[X_test, y_test]])

I got this error:

ValueError: DataFrame.dtypes for data must be int, float or bool.
Did not expect the data types in the following fields: X1, X2, X3, X4, X5, X6, X7, X8, X9

My data has some categorical columns.

Irrelevant to your issue, but before combining CV with early stopping, you may want to have a look here: [Early stopping with Keras and sklearn GridSearchCV cross-validation](https://stackoverflow.com/questions/48127550/early-stopping-with-keras-and-sklearn-gridsearchcv-cross-validation/48139341#48139341) - desertnaut, Aug 27 '20 at 09:34
I have another question after reading your answer. In my code, which data does early stopping apply to: the test data or the validation folds in GridSearchCV? - Tanakorn Taweepoka, Aug 28 '20 at 04:48
