Let's say we tune an SVM with GridSearchCV like this:
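from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, cross_val_score
# scikit-learn's SVM classifier is SVC; there is no SVM class
algorithm = SVC()
parameters = {'kernel': ['rbf', 'sigmoid'], 'C': [0.1, 1, 10]}
grid = GridSearchCV(algorithm, parameters)
grid.fit(X, y)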
You then wish to use the best-fit parameters/estimator in a cross_val_score. My question is: which model is grid at this point? Is it the best-performing one? In other words, can we just do
cross_val_scores = cross_val_score(grid, X=X, y=y)
or should we use
cross_val_scores = cross_val_score(grid.best_estimator_, X=X, y=y)
When I run both, I find that they do not return the same scores, so I am curious what the correct approach is here. (I would assume using the best_estimator_.) That raises another question, though: what does passing just grid use as the model then? The first one?
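For anyone who wants to reproduce the comparison, here is a minimal self-contained sketch. The iris dataset is only a stand-in (an assumption, since the original X and y are not shown), and the variable names nested_scores and fixed_scores are illustrative. As far as I understand scikit-learn's behavior, cross_val_score clones whatever estimator it is given, so passing grid would rerun the whole search on each training fold, while best_estimator_ is a single SVC with already-chosen hyperparameters. That would be one plausible reason the two score arrays differ.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

# Stand-in data (assumption): the question does not show X and y.
X, y = load_iris(return_X_y=True)

parameters = {'kernel': ['rbf', 'sigmoid'], 'C': [0.1, 1, 10]}
grid = GridSearchCV(SVC(), parameters)
grid.fit(X, y)

# Option 1: pass the search object itself. cross_val_score clones it,
# so the grid search is refit on the training part of every outer fold.
nested_scores = cross_val_score(grid, X=X, y=y)

# Option 2: pass the refit best model. Only one SVC configuration is scored,
# using the hyperparameters that were selected on all of X and y.
fixed_scores = cross_val_score(grid.best_estimator_, X=X, y=y)

print("best params:", grid.best_params_)
print("grid:           ", nested_scores.mean())
print("best_estimator_:", fixed_scores.mean())
```

Both calls use the default number of folds, so each returns one score per fold. The exact numbers depend on the data and the scikit-learn version.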