
In the scikit-learn documentation example http://scikit-learn.org/stable/auto_examples/model_selection/grid_search_digits.html, a train_test_split is done before the grid search.

The grid search is then fit on the training set and tested on the test set from the train_test_split.

I wanted to know whether it is possible and advisable to do a k-fold cross validation in place of the train_test_split, so I could fit and test the grid search on different data folds instead of just one train/test split (and consequently get the best score and parameters that way).
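
To make it concrete, here is a minimal sketch of what I have in mind, using the digits data and an SVC as in the linked example (the parameter grid and the number of folds are just placeholders I picked):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import KFold, GridSearchCV
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
param_grid = {'C': [1, 10, 100], 'gamma': [0.001, 0.0001]}

# Replace the single train_test_split with a k-fold loop:
# fit the grid search on each training fold, then score it on the held-out fold.
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(X):
    search = GridSearchCV(SVC(), param_grid, cv=3)
    search.fit(X[train_idx], y[train_idx])
    print(search.best_params_, search.score(X[test_idx], y[test_idx]))
```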

Ricky
  • That is called nested grid search with cross validation. You can look at the [official documentation example](http://scikit-learn.org/stable/auto_examples/model_selection/plot_nested_cross_validation_iris.html) and [my answer here](http://stackoverflow.com/a/42230764/3374996) to understand it better. – Vivek Kumar Mar 04 '17 at 03:06
  • So after we create the GridSearchCV object, is it fitted inside the cross_val_score function with the X_train and y_train of every k-fold iteration rather than the entire X and y? Hopefully yes, because that makes sense to me. – Ricky Mar 04 '17 at 06:00
  • Also, would I be able to call best_estimator_ in this case, after doing what you have done? `clf = GridSearchCV(estimator=svr, param_grid=c_grid, cv=inner_cv); nested_score = cross_val_score(clf, X=X_iris, y=y_iris, cv=outer_cv).mean()` (spelled out in the sketch below) – Ricky Mar 04 '17 at 06:06
  • The step-by-step explanation in that answer is only for the last two lines. – Vivek Kumar Mar 06 '17 at 05:26
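
Edit: for reference, here is roughly what the nested setup from the comments looks like end to end, following the linked iris example (the kernel, parameter grid, and fold counts are assumptions on my part):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X_iris, y_iris = load_iris(return_X_y=True)

svr = SVC(kernel='rbf')
c_grid = {'C': [1, 10, 100], 'gamma': [0.01, 0.1]}

inner_cv = KFold(n_splits=4, shuffle=True, random_state=1)
outer_cv = KFold(n_splits=4, shuffle=True, random_state=1)

# Inner loop: GridSearchCV tunes the parameters on each outer training fold.
clf = GridSearchCV(estimator=svr, param_grid=c_grid, cv=inner_cv)

# Outer loop: cross_val_score refits clf on every outer training fold
# and scores the tuned model on the corresponding held-out fold.
nested_score = cross_val_score(clf, X=X_iris, y=y_iris, cv=outer_cv).mean()
print(nested_score)
```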
