
I've been trying to figure out how the best_score_ attribute of GridSearchCV is calculated (or in other words, what it means). The documentation says:

Score of best_estimator on the left out data.

So, I tried to translate it into something I understand, and computed the r2_score of the actual ys against the out-of-fold predictions from each k-fold split - and got a different result (I used this piece of code):

import numpy as np
from sklearn.metrics import r2_score

# Collect out-of-fold predictions: each sample is predicted by a model
# that was not fitted on it.
test_pred = np.zeros(y.shape) * np.nan
for train_ind, test_ind in kfold:
    clf.best_estimator_.fit(X[train_ind, :], y[train_ind])
    test_pred[test_ind] = clf.best_estimator_.predict(X[test_ind])
r2_test = r2_score(y, test_pred)

I've searched everywhere for a more meaningful explanation of best_score_ and couldn't find one. Would anyone care to explain?

Thanks

Korem
  • It's usually the mean over folds. But it would be great if you could post your full code, e.g. on simulated data. – eickenberg Jun 07 '14 at 10:37

1 Answer


It's the mean cross-validation score of the best estimator. Let's make some data and fix the cross-validation split of that data.

>>> import numpy as np
>>> from sklearn.cross_validation import KFold
>>> y = np.linspace(-5, 5, 200)
>>> X = (y + np.random.randn(200)).reshape(-1, 1)
>>> threefold = list(KFold(len(y)))

Now run cross_val_score and GridSearchCV, both with these fixed folds.

>>> from sklearn.cross_validation import cross_val_score
>>> from sklearn.grid_search import GridSearchCV
>>> from sklearn.linear_model import LinearRegression
>>> cross_val_score(LinearRegression(), X, y, cv=threefold)
array([-0.86060164,  0.2035956 , -0.81309259])
>>> gs = GridSearchCV(LinearRegression(), {}, cv=threefold, verbose=3).fit(X, y) 
Fitting 3 folds for each of 1 candidates, totalling 3 fits
[CV]  ................................................................
[CV] ...................................... , score=-0.860602 -   0.0s
[Parallel(n_jobs=1)]: Done   1 jobs       | elapsed:    0.0s
[CV]  ................................................................
[CV] ....................................... , score=0.203596 -   0.0s
[CV]  ................................................................
[CV] ...................................... , score=-0.813093 -   0.0s
[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    0.0s finished

Note the score=-0.860602, score=0.203596 and score=-0.813093 in the GridSearchCV output; exactly the values returned by cross_val_score.

Note that the "mean" is really a macro-average over the folds. The iid parameter to GridSearchCV can be used to get a micro-average over the samples instead.
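
That macro/micro distinction is also why, in the comments below, gs.best_score_ and cross_val_score(...).mean() differ slightly. Here is a minimal sketch of the two averages, assuming the per-fold scores shown above and the test-fold sizes that old-style KFold produces for 200 samples (67, 67 and 66):

import numpy as np

# Per-fold scores from the GridSearchCV / cross_val_score run above.
fold_scores = np.array([-0.860602, 0.203596, -0.813093])
# Test-fold sizes: KFold splits 200 samples into folds of 67, 67 and 66.
fold_sizes = np.array([67, 67, 66])

macro = fold_scores.mean()                           # plain mean over folds, ~ -0.4900
micro = np.average(fold_scores, weights=fold_sizes)  # sample-weighted (iid=True), ~ -0.4884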

Fred Foo
  • Could you add the output for `gs.best_score_` and `cross_val_scores.mean()`? – eickenberg Jun 07 '14 at 10:44
  • Indeed: `>>> gs.best_score_ -0.41004566175481089 >>> cross_val_score(LinearRegression(), X, y, cv=threefold).mean() -0.41073841862279581` – Korem Jun 07 '14 at 10:55
  • I would appreciate further explanation of cross_val_score - is it the r2_score for each fold? – Korem Jun 07 '14 at 10:57
  • @TalKremerman It's whatever `estimator.score(X[test_ind], y[test_ind])` returns, and for regression estimators, that's the R². – Fred Foo Jun 07 '14 at 10:58
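
To make that last comment concrete, here is a minimal sketch (reusing X, y and threefold from the answer above) checking that a regressor's score method returns the same value as r2_score on its predictions:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Fit on one training fold, then score the held-out fold both ways.
train_ind, test_ind = threefold[0]
est = LinearRegression().fit(X[train_ind], y[train_ind])

assert np.isclose(est.score(X[test_ind], y[test_ind]),
                  r2_score(y[test_ind], est.predict(X[test_ind])))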