
I have checked all the SO questions that generate a confusion matrix and calculate TP, TN, FP, FN.

Scikit-learn: How to obtain True Positive, True Negative, False Positive and False Negative

Mainly they use

from sklearn.metrics import confusion_matrix

For two classes it's easy:

from sklearn.metrics import confusion_matrix

y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]   

tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
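
With more than two classes, confusion_matrix returns an N x N matrix, so the same ravel() trick no longer gives the four values per class (a quick sketch with made-up 3-class labels):

cm = confusion_matrix([0, 2, 1, 2], [0, 1, 1, 2], labels=[0, 1, 2])
print(cm.shape)   # (3, 3) -- no single tn/fp/fn/tp to unpack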

For multiclass there is one solution, but it only does this for the first class, not all classes:

def perf_measure(y_actual, y_pred):
    # Tally TP/FP/TN/FN for every class that appears in either list
    class_id = set(y_actual).union(set(y_pred))
    TP = []
    FP = []
    TN = []
    FN = []

    for index, _id in enumerate(class_id):
        TP.append(0)
        FP.append(0)
        TN.append(0)
        FN.append(0)
        for i in range(len(y_pred)):
            if y_actual[i] == y_pred[i] == _id:
                TP[index] += 1
            if y_pred[i] == _id and y_actual[i] != y_pred[i]:
                FP[index] += 1
            if y_actual[i] == y_pred[i] != _id:
                TN[index] += 1
            if y_pred[i] != _id and y_actual[i] != y_pred[i]:
                FN[index] += 1

    return class_id, TP, FP, TN, FN
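
For reference, it would be called like this (made-up labels, just to show the call signature):

y_actual = [0, 1, 2, 2, 1, 0, 2, 1]
y_pred   = [0, 2, 2, 2, 1, 1, 0, 1]
class_id, TP, FP, TN, FN = perf_measure(y_actual, y_pred)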

By default, this calculates the values for only one class.

But I want to calculate the values for each class, given 4 classes, for this data: https://extendsclass.com/csv-editor.html#0697f61

I have done it in Excel like this:

(screenshot)

Then calculated the results for each class:

(screenshot)

I have automated it in an Excel sheet, but is there a programmatic solution in Python or sklearn to do this?

user2129623

1 Answer


This is way easier with multilabel_confusion_matrix. For your example, you can also pass labels=["A", "N", "O", "~"] as an argument to get the labels in the preferred order.

from sklearn.metrics import multilabel_confusion_matrix

# mcm has shape (n_classes, 2, 2); each 2x2 block is [[TN, FP], [FN, TP]]
mcm = multilabel_confusion_matrix(y_true, y_pred)

tps = mcm[:, 1, 1]
tns = mcm[:, 0, 0]

recall      = tps / (tps + mcm[:, 1, 0])         # Sensitivity / TPR
specificity = tns / (tns + mcm[:, 0, 1])         # Specificity / TNR
precision   = tps / (tps + mcm[:, 0, 1])         # PPV

Which results in an array for each metric:

[[0.83333333 0.94285714 0.64       0.25      ]   # Sensitivity / Recall
 [0.99029126 0.74509804 0.91666667 1.        ]   # Specificity
 [0.9375     0.83544304 0.66666667 1.        ]]  # Precision / PPV
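
If you want the metrics tied to the class names, one small sketch (assuming y_true and y_pred contain the same four labels as above) is to fix the order with labels and zip:

labels = ["A", "N", "O", "~"]                    # fixes the row order of mcm
mcm = multilabel_confusion_matrix(y_true, y_pred, labels=labels)

tps, fps = mcm[:, 1, 1], mcm[:, 0, 1]
fns, tns = mcm[:, 1, 0], mcm[:, 0, 0]

# e.g. per_class["~"]["specificity"]
per_class = {
    label: {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision":   tp / (tp + fp),
    }
    for label, tp, fp, fn, tn in zip(labels, tps, fps, fns, tns)
}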

Alternatively, you can view per-class precision and recall with classification_report. With output_dict=True you can get the same values keyed by each class label.

>>> from sklearn.metrics import classification_report
>>> print(classification_report(y_true, y_pred))
              precision    recall  f1-score   support

           A       0.94      0.83      0.88        18
           N       0.84      0.94      0.89        70
           O       0.67      0.64      0.65        25
           ~       1.00      0.25      0.40         8

    accuracy                           0.82       121
   macro avg       0.86      0.67      0.71       121
weighted avg       0.83      0.82      0.81       121
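
A minimal sketch of the output_dict=True version mentioned above (same y_true / y_pred assumed):

report = classification_report(y_true, y_pred, output_dict=True)

print(report["~"]["recall"])      # 0.25, as in the table above
print(report["A"]["precision"])   # 0.9375, shown rounded as 0.94 above
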
Alexander L. Hayes
  • Can you please include how we can see sensitivity, specificity, pos pred for each class separately ? – user2129623 Dec 29 '22 at 13:35
  • Sorry, I don't understand. `specificity = tns / (tns + mcm[:, 0, 1])` gives the specificity for each class. If you want one class, you can select it with `specificity[0]`, `specificity[1]`, etc. – Alexander L. Hayes Dec 29 '22 at 15:05