
I'm trying to clarify something about accuracy in Python. I have 3 classes of cancer and I'm trying to predict samples (patients) by their condition. I have followed the method proposed in another Stack Overflow answer:

True Positive Rate and False Positive Rate (TPR, FPR) for Multi-Class Data in python

Now I have done exactly the same (only the sensitivity, specificity and accuracy parts were needed):

 import numpy as np
 from sklearn.metrics import confusion_matrix

 cnf_matrix = confusion_matrix(y_test, pred_y)

 # Off-diagonal column sums are false positives, off-diagonal row sums
 # are false negatives, and the diagonal holds the true positives.
 FP = cnf_matrix.sum(axis=0) - np.diag(cnf_matrix)
 FN = cnf_matrix.sum(axis=1) - np.diag(cnf_matrix)
 TP = np.diag(cnf_matrix)
 TN = cnf_matrix.sum() - (FP + FN + TP)

 FP = FP.astype(float)
 FN = FN.astype(float)
 TP = TP.astype(float)
 TN = TN.astype(float)

 # Sensitivity, hit rate, recall, or true positive rate
 Sensitivity = TP / (TP + FN)
 # Specificity or true negative rate
 Specificity = TN / (TN + FP)
 # Overall accuracy (even if I don't think it's overall)
 ACC = (TP + TN) / (TP + FP + FN + TN)

As a result I get 3 arrays (sensitivity, specificity and accuracy), and each of them contains 3 values (one per class, I guess):

Sensitivity : [0.76999182 0.99404079 0.96377484]
Specificity : [0.98132687 0.97199254 0.9036957 ]
ACC         : [0.91487179 0.97717949 0.92794872]

But in that post the author spoke about "overall accuracy", while I instead get an individual accuracy for each class (not bad, though). In fact, when I use accuracy_score from scikit-learn, the final accuracy is different:

accuracy = accuracy_score(y_test, pred_y)
accuracy:  0.9099999999999991

I assume that using that technique I get an accuracy for each class, so I can compute the mean accuracy (in this case 0.9399999999999992), while scikit-learn gives me the overall accuracy? I think it is important to know which is which, because sometimes the difference is about 20%, and that is a lot.
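
To make the difference concrete, here is a toy example (made-up labels, not my real data) where the two numbers clearly diverge:

 import numpy as np
 from sklearn.metrics import accuracy_score, confusion_matrix

 # Made-up labels for a 3-class problem
 y_true = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 2])
 y_hat  = np.array([0, 0, 1, 1, 2, 2, 2, 2, 0, 0])

 cm = confusion_matrix(y_true, y_hat)
 TP = np.diag(cm)
 FP = cm.sum(axis=0) - TP
 FN = cm.sum(axis=1) - TP
 TN = cm.sum() - (TP + FP + FN)

 per_class_acc = (TP + TN) / (TP + FP + FN + TN)
 print(per_class_acc.mean())           # 0.733... (mean of per-class accuracies)
 print(accuracy_score(y_true, y_hat))  # 0.6      (overall accuracy)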

Looser

1 Answer


The accuracy returned from sklearn.metrics.accuracy_score is

(number of correctly predicted samples) / (total number of samples)

i.e., the overall accuracy.
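
Equivalently, as a minimal sketch (assuming y_test and pred_y are array-likes of the same length):

 import numpy as np

 # Fraction of positions where prediction and truth agree
 overall_acc = np.mean(np.asarray(y_test) == np.asarray(pred_y))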

What you're computing there is not the accuracy for the entire dataset; it is the accuracy of the one-vs-rest binary classification problem for each label, i.e., the usual definition of accuracy for binary classification, applied per class.
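
You can check this directly: each of your per-class ACC values should match accuracy_score applied to a one-vs-rest binarization of the labels. A minimal sketch, assuming y_test and pred_y are NumPy arrays of integer class labels:

 import numpy as np
 from sklearn.metrics import accuracy_score

 for c in np.unique(y_test):
     # Binarize: "is class c" vs "is not class c"
     print(c, accuracy_score(y_test == c, pred_y == c))

Note also why your mean sits above the overall accuracy: in a single-label problem, every misclassified sample contributes exactly one FP (to the predicted class) and one FN (to the true class), so the mean of these K binary accuracies is 1 - 2*(1 - overall accuracy)/K. With your numbers, 1 - 2*(1 - 0.91)/3 = 0.94, which matches the mean you computed.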

I haven't really ever seen that metric used; generally you'd pay attention to precision, recall, F1 score, and the actual accuracy. Even if you wanted to use it, you should be careful when computing the mean: often there is class imbalance in your data, so you might want to use a weighted mean, as sketched below.
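
For example, a support-weighted mean, weighting each class by its number of true samples (the row sums of the confusion matrix), could look like this sketch (reusing cnf_matrix and ACC from your question):

 import numpy as np

 support = cnf_matrix.sum(axis=1)  # number of true samples per class
 weighted_mean_acc = np.average(ACC, weights=support)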

Kraigolas
  • The lists are the results of computing sensitivity, specificity and accuracy for each class from the confusion matrix. I did it by hand and the values are right. But what you say certainly seems right, I just did not get it completely (sorry). Are you saying that computing an overall accuracy is not possible and I should keep my 3 accuracies separate, or vice versa, that computing one per class is misleading and it is better to just compute the accuracy with sklearn.metrics.accuracy_score? – Looser Oct 03 '22 at 14:06