
Here is my code. It starts with a pipeline (standardizing, replacing null values, one-hot encoding, and SelectKBest) followed by a LightGBM model that is fit to my data.

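from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.impute import SimpleImputer
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import GridSearchCV
from lightgbm import LGBMClassifier

numeric_features = ['X10', 'X11', 'X12', 'X13', 'X14']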
numeric_transformer = Pipeline(steps=[('scaler', StandardScaler())])

categorical_features = ['X1', 'X2', 'X3', 'X4', 'X5', 'X6', 'X7', 'X8', 'X9']
categorical_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='constant', fill_value='FLAG_NAN')),('onehot', OneHotEncoder(handle_unknown='ignore'))])

preprocessor = ColumnTransformer(transformers=[('num', numeric_transformer, numeric_features),('cat', categorical_transformer, categorical_features)])

pipe = Pipeline(steps=[('preprocessor', preprocessor),('selector', SelectKBest(mutual_info_classif, k=5)),('classifier',LGBMClassifier())])
   
search_space = dict(classifier =[LGBMClassifier()])

X_train = train.drop(columns=['Y'])
X_test = test.drop(columns=['Y'])
y_train = train['Y']
y_test = test['Y']

grid_search_pipe = GridSearchCV(estimator=pipe, param_grid=search_space, scoring="neg_mean_squared_error", cv=5)

grid_search_pipe.fit(X_train, y_train, classifier__early_stopping_rounds=10, classifier__eval_metric="rmse", classifier__eval_set=[[X_test, y_test]])

And I got this error:

ValueError: DataFrame.dtypes for data must be int, float or bool.
Did not expect the data types in the following fields: X1, X2, X3, X4, X5, X6, X7, X8, X9

My data has some categorical columns (X1 through X9).

Tanakorn Taweepoka
  • Irrelevant to your issue, but before combining CV with early stopping, you may want to have a look here: [Early stopping with Keras and sklearn GridSearchCV cross-validation](https://stackoverflow.com/questions/48127550/early-stopping-with-keras-and-sklearn-gridsearchcv-cross-validation/48139341#48139341) – desertnaut Aug 27 '20 at 09:34
  • I have another question after reading your answer. In my code, which data does early stopping use: the test data or the validation folds created by GridSearchCV? – Tanakorn Taweepoka Aug 28 '20 at 04:48
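
On the error itself and the early-stopping question above, here is a minimal sketch, not a verified fix. It assumes the cause is that fit parameters such as classifier__eval_set are handed to the final step untouched, so LightGBM receives the raw X_test (string columns X1 to X9 included) while the training data has already gone through the preprocessor and selector. Under that reading, the same eval_set is also reused in every CV fit, so early stopping would be watching the test data and not the GridSearchCV validation folds. The toy frame, the make_frame helper, the reduced column set, k=3 and the binary_logloss metric are all made up for illustration; the early_stopping callback is used because recent LightGBM releases no longer accept early_stopping_rounds in fit().

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

try:  # LightGBM is optional here so the preprocessing part runs without it
    from lightgbm import LGBMClassifier, early_stopping
except ImportError:
    LGBMClassifier = None

# Hypothetical toy data standing in for train/test: two numeric columns,
# one categorical column with missing values, and a binary target Y.
rng = np.random.default_rng(0)


def make_frame(n):
    levels = np.array(["a", "b", np.nan], dtype=object)
    frame = pd.DataFrame({
        "X10": rng.normal(size=n),
        "X11": rng.normal(size=n),
        "X1": rng.choice(levels, size=n),
    })
    frame["Y"] = (frame["X10"] + 0.5 * rng.normal(size=n) > 0).astype(int)
    return frame


train, valid = make_frame(200), make_frame(80)
X_train, y_train = train.drop(columns=["Y"]), train["Y"]
X_valid, y_valid = valid.drop(columns=["Y"]), valid["Y"]

categorical_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="constant", fill_value="FLAG_NAN")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])
preprocessor = ColumnTransformer(transformers=[
    ("num", StandardScaler(), ["X10", "X11"]),
    ("cat", categorical_transformer, ["X1"]),
])

# Every step except the classifier, kept separate so the evaluation data
# can be pushed through the same fitted transformations.
preprocessing = Pipeline(steps=[
    ("preprocessor", preprocessor),
    ("selector", SelectKBest(mutual_info_classif, k=3)),
])

X_train_t = preprocessing.fit_transform(X_train, y_train)
X_valid_t = preprocessing.transform(X_valid)  # numeric now, no string columns

if LGBMClassifier is not None:
    clf = LGBMClassifier(n_estimators=50, verbose=-1)
    clf.fit(
        X_train_t,
        y_train,
        eval_set=[(X_valid_t, y_valid)],  # transformed, not the raw frame
        eval_metric="binary_logloss",
        callbacks=[early_stopping(stopping_rounds=10, verbose=False)],
    )
```

To combine this with GridSearchCV, one option is to run the search without early stopping and apply it only in a final refit like the one above, which also sidesteps the concern raised in the linked answer.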
