I'm trying to clarify something about accuracy in Python. I have 3 classes of cancer and I'm trying to predict each sample's (patient's) condition. I followed the method proposed in this other Stack Overflow post:
True Positive Rate and False Positive Rate (TPR, FPR) for Multi-Class Data in python
I did exactly the same thing (I only needed the sensitivity, specificity and accuracy parts):
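import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Rows are the true classes, columns are the predicted classes
cnf_matrix = confusion_matrix(y_test, pred_y)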
FP = cnf_matrix.sum(axis=0) - np.diag(cnf_matrix)
FN = cnf_matrix.sum(axis=1) - np.diag(cnf_matrix)
TP = np.diag(cnf_matrix)
TN = cnf_matrix.sum() - (FP + FN + TP)
FP = FP.astype(float)
FN = FN.astype(float)
TP = TP.astype(float)
TN = TN.astype(float)
# Sensitivity, hit rate, recall, or true positive rate
Sensitivity = TP/(TP+FN)
# Specificity or true negative rate
Specificity = TN/(TN+FP)
# Overall accuracy (although I don't think it is actually overall)
ACC = (TP+TN)/(TP+FP+FN+TN)
As a result I get 3 arrays (sensitivity, specificity and accuracy), and each of them contains 3 values (I guess one per class):
Sensitivity : [0.76999182 0.99404079 0.96377484]
Specificity : [0.98132687 0.97199254 0.9036957 ]
ACC : [0.91487179 0.97717949 0.92794872]
But the post talks about "overall accuracy", whereas I get a separate accuracy for each class (which is not bad in itself). In fact, when I use accuracy_score from scikit-learn, the final accuracy is different:
accuracy = accuracy_score(y_test,pred_y)
accuracy: 0.9099999999999991
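For reference, accuracy_score with its default settings returns the fraction of samples whose predicted label equals the true label, which corresponds to the diagonal of the confusion matrix divided by its total. A minimal sketch of that relationship, using only NumPy and a made-up confusion matrix (the numbers are hypothetical, not the data from this question):

```python
import numpy as np

# Hypothetical 3-class confusion matrix: rows = true class, columns = predicted class.
cm = np.array([
    [50,  5,  5],
    [ 2, 90,  8],
    [ 3,  7, 30],
])

# Overall accuracy: correctly classified samples (the diagonal) over all samples.
overall_accuracy = np.trace(cm) / cm.sum()

# Rebuild label vectors that would produce this matrix, then count matches directly.
true_idx, pred_idx = np.indices(cm.shape)
y_true = np.repeat(true_idx.ravel(), cm.ravel())
y_pred = np.repeat(pred_idx.ravel(), cm.ravel())
fraction_correct = np.mean(y_true == y_pred)

print(overall_accuracy, fraction_correct)
```

Both values are the same single number for the whole test set, not one value per class.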
I assume that with the linked technique I get one accuracy per class, so I can compute the mean accuracy (0.9399999999999992 in this case), while scikit-learn gives me the overall accuracy. Is that right? I think it is important to know which is which, because sometimes the difference is about 20%, which is a lot.
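To make the two quantities concrete, here is a minimal sketch on a made-up confusion matrix (again hypothetical numbers, not my data). It computes the per-class one-vs-rest accuracy exactly as in the code above, averages it, and compares it with the overall accuracy. Each misclassified sample counts once as a false negative (for its true class) and once as a false positive (for its predicted class), so for K classes the mean per-class accuracy should equal 1 - 2 * (1 - overall) / K:

```python
import numpy as np

# Hypothetical 3-class confusion matrix: rows = true class, columns = predicted class.
cm = np.array([
    [50,  5,  5],
    [ 2, 90,  8],
    [ 3,  7, 30],
])
n_classes = cm.shape[0]

# Per-class one-vs-rest counts, same formulas as in the question.
TP = np.diag(cm).astype(float)
FP = cm.sum(axis=0) - TP
FN = cm.sum(axis=1) - TP
TN = cm.sum() - (FP + FN + TP)

# One accuracy value per class (class k versus everything else).
per_class_acc = (TP + TN) / (TP + FP + FN + TN)
mean_per_class_acc = per_class_acc.mean()

# A single accuracy for the whole test set.
overall_acc = TP.sum() / cm.sum()

# Mean per-class accuracy expressed through the overall accuracy.
predicted_mean = 1 - 2 * (1 - overall_acc) / n_classes

print(per_class_acc, mean_per_class_acc, overall_acc, predicted_mean)
```

With the values from this question the same relation seems to hold: 1 - 2 * (1 - 0.91) / 3 = 0.94, which matches the mean of the three per-class accuracies. The mean is higher because every class's true negatives are counted as correct, even though they say nothing about that class being predicted correctly.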