35

Does cross_val_predict (see doc, v0.18) with the k-fold method, as shown in the code below, calculate the accuracy for each fold and average them at the end, or not?

from sklearn.model_selection import KFold, cross_val_predict
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

cv = KFold(n_splits=20)
clf = SVC()
ypred = cross_val_predict(clf, td, labels, cv=cv)  # td, labels: my feature matrix and target vector
accuracy = accuracy_score(labels, ypred)
print(accuracy)
– Roman

4 Answers

111

No, it does not!

According to the cross-validation documentation page, cross_val_predict does not return any scores, only the labels predicted according to a certain strategy, which is described here:

The function cross_val_predict has a similar interface to cross_val_score, but returns, for each element in the input, the prediction that was obtained for that element when it was in the test set. Only cross-validation strategies that assign all elements to a test set exactly once can be used (otherwise, an exception is raised).

Therefore, by calling accuracy_score(labels, ypred) you are just computing the accuracy of the labels predicted by the aforementioned strategy against the true labels. This is also noted on the same documentation page:

These predictions can then be used to evaluate the classifier:

predicted = cross_val_predict(clf, iris.data, iris.target, cv=10) 
metrics.accuracy_score(iris.target, predicted)

Note that the result of this computation may be slightly different from those obtained using cross_val_score as the elements are grouped in different ways.
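To make that difference concrete, here is a quick sketch (reusing the iris data from the docs example above) that computes both numbers side by side. They coincide when all folds have the same size; with unequal folds they can diverge, because the pooled score effectively weights each fold by its size:

from sklearn import datasets
from sklearn.model_selection import cross_val_predict, cross_val_score
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

iris = datasets.load_iris()
clf = SVC()

# one pooled score over the merged out-of-fold predictions
pooled = accuracy_score(iris.target, cross_val_predict(clf, iris.data, iris.target, cv=10))

# mean of the ten per-fold accuracies
fold_mean = cross_val_score(clf, iris.data, iris.target, cv=10).mean()

print(pooled, fold_mean)  # equal here (equal-sized folds); may differ when fold sizes differ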

If you need the accuracy scores of the individual folds, you should try:

>>> scores = cross_val_score(clf, X, y, cv=cv)
>>> scores                                              
array([ 0.96...,  1.  ...,  0.96...,  0.96...,  1.        ])

and then, for the mean accuracy over all folds, use scores.mean():

>>> print("Accuracy: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2))
Accuracy: 0.98 (+/- 0.03)

How to calculate Cohen kappa coefficient and confusion matrix for each fold?

For the Cohen kappa coefficient and confusion matrix, I assume you mean the kappa coefficient and confusion matrix between the true labels and each fold's predicted labels:

import numpy as np
from sklearn.model_selection import KFold
from sklearn.svm import SVC
from sklearn.metrics import cohen_kappa_score, confusion_matrix

cv = KFold(n_splits=20)
clf = SVC()
kappa_scores, conf_matrices = [], []
for train_index, test_index in cv.split(X):
    clf.fit(X[train_index], labels[train_index])
    ypred = clf.predict(X[test_index])
    kappa_scores.append(cohen_kappa_score(labels[test_index], ypred))
    # fix the label set so every fold's matrix has the same shape
    conf_matrices.append(confusion_matrix(labels[test_index], ypred, labels=np.unique(labels)))
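If you then want aggregate values across folds, one common convention (just a sketch, not the only option) is to average the per-fold kappa scores and sum the per-fold confusion matrices, using the kappa_scores and conf_matrices lists collected above:

import numpy as np

print(np.mean(kappa_scores))          # average kappa over the 20 folds
print(np.sum(conf_matrices, axis=0))  # per-fold confusion matrices summed element-wise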

What does cross_val_predict return?

It uses KFold to split the data into k parts and then, for iterations i = 1..k:

  • takes the i'th part as the test data and all other parts as the training data
  • trains the model on the training data (all parts except the i'th)
  • uses this trained model to predict labels for the i'th part (the test data)

In each iteration, the labels of the i'th part of the data get predicted. In the end, cross_val_predict merges all of the partial predictions and returns them as the final result.

This code shows this process step by step:

import numpy as np
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.svm import SVC

X = np.array([[0], [1], [2], [3], [4], [5]])
labels = np.array(['a', 'a', 'a', 'b', 'b', 'b'])

cv = KFold(n_splits=3)
clf = SVC()
ypred_all = np.full(labels.shape, '', dtype=labels.dtype)  # placeholder for the merged predictions
for i, (train_index, test_index) in enumerate(cv.split(X), start=1):
    print("iteration", i, ":")
    print("train indices:", train_index)
    print("train data:", X[train_index])
    print("test indices:", test_index)
    print("test data:", X[test_index])
    clf.fit(X[train_index], labels[train_index])
    ypred = clf.predict(X[test_index])
    print("predicted labels for data of indices", test_index, "are:", ypred)
    ypred_all[test_index] = ypred
    print("merged predicted labels:", ypred_all)
    print("=====================================")
y_cross_val_predict = cross_val_predict(clf, X, labels, cv=cv)
print("predicted labels by cross_val_predict:", y_cross_val_predict)

The result is:

iteration 1 :
train indices: [2 3 4 5]
train data: [[2] [3] [4] [5]]
test indices: [0 1]
test data: [[0] [1]]
predicted labels for data of indices [0 1] are: ['b' 'b']
merged predicted labels: ['b' 'b' '' '' '' '']
=====================================
iteration 2 :
train indices: [0 1 4 5]
train data: [[0] [1] [4] [5]]
test indices: [2 3]
test data: [[2] [3]]
predicted labels for data of indices [2 3] are: ['a' 'b']
merged predicted labels: ['b' 'b' 'a' 'b' '' '']
=====================================
iteration 3 :
train indices: [0 1 2 3]
train data: [[0] [1] [2] [3]]
test indices: [4 5]
test data: [[4] [5]]
predicted labels for data of indices [4 5] are: ['a' 'a']
merged predicted labels: ['b' 'b' 'a' 'b' 'a' 'a']
=====================================
predicted labels by cross_val_predict: ['b' 'b' 'a' 'b' 'a' 'a']
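To tie this back to the original question: scoring the merged predictions gives a single pooled number, not a per-fold average. With the toy labels above, 2 of the 6 merged predictions match the true labels:

from sklearn.metrics import accuracy_score

print(accuracy_score(labels, y_cross_val_predict))  # 2/6 ≈ 0.333, one pooled score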
– Omid
  • Hi, thanks. I got how to calculate `cross_val_score` and average for each fold. Similarly, could you show me how to calculate `Cohen kappa coefficient` and `confusion matrix` for each fold and then average? – Roman Jan 08 '17 at 01:40
  • Hi. See my update for the Cohen kappa coefficient and confusion matrix. What do you mean by `then average`? – Omid Jan 08 '17 at 02:21
  • Hi, thanks again. I got your edit and understood the matter. I have one last confusion... In my question, `ypred = cross_val_predict(clf, td, labels, cv=cv)`, could you explain to me how `ypred` was calculated, in layman's language... – Roman Jan 08 '17 at 02:23
  • 2
    KFold splits the data into k parts and then, for i=1..k iterations, does this: takes all parts except the i'th part as the training data, fits the model with them, and then predicts labels for the i'th part (the test data). In each iteration, the labels of the i'th part get predicted. In the end, `cross_val_predict` merges all the partially predicted labels and returns them as a whole. – Omid Jan 08 '17 at 02:30
  • Still difficult to understand. Could you show it in a similar way to how you explained before, using EDIT... – Roman Jan 08 '17 at 02:35
  • When would one want to use cross_val_predict? cross_val_score already returns the scores of each fold. I would hesitate to use cross_val_predict to output predicted values for the entire period the data covers, say to visualise the predicted values against the true values, since the hyperparameters C and gamma have not been tuned via cross validation here. Maybe best to split the data into a training and test set, tune the hyperparameters on the training set with cross validation, and then evaluate the final model with the test set, as per this: https://sadanand-singh.github.io/posts/svmpython/ – Oliver Angelil Dec 16 '17 at 04:18
  • @Omid Thank you for the wonderful answer. I am using the code you have provided in the answer. However, I encountered an issue while running this code, where I posted a question that can be found in: https://stackoverflow.com/questions/58895897/how-to-write-cross-validation-in-python Please let me know your thoughts on it. Thank you. Looking forward to hearing from you :) – EmJ Nov 16 '19 at 23:21
  • @EmJ I'm now briefly accessing the internet after a week of ongoing total and nation-wide internet outage caused by the government of Iran. Sorry to see your question is gone, I hope you've found a solution to your problem by now. P.S.: I probably won't be available for an unknown amount of time again. – Omid Nov 23 '19 at 04:54
  • How are the two methods doing something different? They both train and test on the k-1 folds in the same way. – Helen Jun 23 '22 at 05:13
8

As you can see from the source code of cross_val_predict on GitHub, the function computes the predictions for each fold and concatenates them. The predictions are made with a model learned from the other folds.

Here is a combination of your code and the example provided in the source code:

from sklearn import datasets, linear_model
from sklearn.model_selection import cross_val_predict, KFold
from sklearn.metrics import accuracy_score

diabetes = datasets.load_diabetes()
X = diabetes.data[:400]
y = diabetes.target[:400]
cv = KFold(n_splits=20)
lasso = linear_model.Lasso()
y_pred = cross_val_predict(lasso, X, y, cv=cv)
accuracy = accuracy_score(y.astype(int), y_pred.astype(int))  # (y_true, y_pred) argument order

print(accuracy)
# >>> 0.0075

Finally, to answer your question: "No, the accuracy is not averaged across the folds."

– DiKorsch
  • `the function computes for each fold the predictions and concatenates them.` What do you mean by `concatenates`? What is the retrieved accuracy mean? Seems it mess up everything. How can I calculate accuracy by averaging for each fold? – Roman Jan 07 '17 at 04:49
  • 3
    I think Omid has explained it quite comprehensively ;) – DiKorsch Jan 08 '17 at 13:44
2

As it is written in the documentation of sklearn.model_selection.cross_val_predict:

It is not appropriate to pass these predictions into an evaluation metric. Use cross_validate to measure generalization error.
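For completeness, a minimal sketch of the cross_validate route the documentation recommends (illustrated on the iris data; the estimator and scoring choice are placeholders):

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_validate
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
results = cross_validate(SVC(), X, y, cv=5, scoring="accuracy")
print(results["test_score"])         # per-fold accuracies
print(results["test_score"].mean())  # estimate of generalization accuracy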

– hey_rey
  • 5
    Why is that true though? What is the difference between using cross_val_predict and cross_validate making only the latter suitable for evaluation? – Femkemilene Jun 10 '19 at 14:25
0

I would like to add a quick and easy option, beyond what the previous contributors have provided.

If you take the micro average of F1, you essentially get the accuracy rate. For example:

from sklearn.model_selection import cross_val_predict
from sklearn.metrics import precision_recall_fscore_support as score

# lm, df, y: your estimator, feature matrix and target vector
y_pred = cross_val_predict(lm, df, y, cv=5)
precision, recall, fscore, support = score(y, y_pred, average='micro')
print(fscore)

This works mathematically: with micro averaging, precision, recall, and F1 are computed globally over the pooled confusion matrix, and for single-label classification the micro-averaged F1 equals the accuracy.
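As a quick sanity check of that equivalence, here is a toy example (made-up label vectors, purely for illustration):

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [0, 1, 2, 2, 1, 0]   # made-up true labels
y_pred = [0, 2, 2, 2, 1, 1]   # made-up predictions
_, _, micro_f1, _ = precision_recall_fscore_support(y_true, y_pred, average='micro')
print(micro_f1)                        # 0.666...
print(accuracy_score(y_true, y_pred))  # 0.666..., the same value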

Good luck.

– Shlomo Koppel