I have objects that were classified by a human into 5 classes, with labels [0, 1, 2, 3, 4].

A set of true labels: true_label = [3, 4, 2, 1, 0, 2 ............, 3]

A set of predicted labels: predictions = [3, 4, 2, 2, 0, 2, ........., 3]

How do I plot a ROC curve with such hard class predictions? Plotting a ROC curve with the sklearn API seems to require predictions in the form of probabilities, but a human's categorical prediction comes with no such probabilities. A human cannot attach a 'probability' to a prediction; he/she simply decides the object is class 2, not class 2 with 93% probability.

How do I plot a ROC curve from the true_label and predictions lists above?

desertnaut
Maximus
  • I’m voting to close this question because it is not about programming as defined in the [help], and it stems from a fundamental misunderstanding of the relevant underlying theory. – desertnaut Mar 28 '21 at 15:32
  • @desertnaut Are you saying we can't plot a ROC curve with the labels above? Then how did the authors plot ROC curves for human assessors, as in figures 2 and 3 of this paper: https://www.nature.com/articles/s41586-019-1799-6 – Maximus Mar 28 '21 at 15:39
  • The article you have linked to is behind a paywall, so I cannot view it and comment... – desertnaut Mar 28 '21 at 15:41
  • That said, notice the expression "*the AUC-ROC for the **average** radiologist*" in the abstract; my guess is - they *averaged* the 0/1 classifications provided by *multiple* human radiologists, and then treated this average value as the probability output of the (imaginary) "average radiologist". Not at all what you ask here. – desertnaut Mar 28 '21 at 15:52
  • You can click into "Figures" on the right side to see the figures without paying for the article. In particular, the "extended Data Fig. 2", which can be seen in higher resolution without payment, shows ROC curve for each individual radiologist. How can that be done without "probabilities" given by the radiologists? – Maximus Mar 28 '21 at 16:06
  • We can choose between the fundamental theory of ROC and a figure caption without the exact context (it is not even clear what the mentioned "Readers" are exactly); I choose the former, and I kindly suggest you do so, too. – desertnaut Mar 28 '21 at 22:25
  • In the blurry free images, the readers appear as individual points on the plot, presumably using their sensitivity and 1-specificity as coordinates for a point estimate, whereas the 'AI system' appears to be the only curve represented. – Brian D Mar 30 '21 at 04:46

1 Answer

You cannot plot a ROC curve using predicted labels.

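As with any ROC curve function, sklearn's roc_curve() is designed to receive an array of true labels and an array of scores (probability estimates or other continuous confidence values), not hard class labels. It also handles only the binary case, so a 5-class problem would have to be treated one class against the rest.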

You can find more detailed answers in this question, but in essence, the function uses each predicted probability as a threshold: samples scoring at or above the threshold are labelled positive, which yields one array of predicted labels. In turn, each threshold yields one true positive rate and one false positive rate. Repeating this process for each distinct value in the array of predicted probabilities traces out the ROC curve.
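The thresholding idea described above can be sketched by hand. This is a minimal illustration and not sklearn's actual implementation; the binary `y_true` and `y_score` arrays and the helper `roc_points` are hypothetical and do not come from the question.

```python
import numpy as np

# Hypothetical binary example (1 = positive, 0 = negative); not the question's data.
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

def roc_points(y_true, y_score):
    """Return (fpr, tpr) arrays with one point per distinct score used as a threshold."""
    thresholds = np.sort(np.unique(y_score))[::-1]  # sweep from highest to lowest
    n_pos = np.sum(y_true == 1)
    n_neg = np.sum(y_true == 0)
    fpr, tpr = [0.0], [0.0]  # origin: nothing is predicted positive
    for t in thresholds:
        y_pred = (y_score >= t).astype(int)  # hard labels at this threshold
        tp = np.sum((y_pred == 1) & (y_true == 1))
        fp = np.sum((y_pred == 1) & (y_true == 0))
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
    return np.array(fpr), np.array(tpr)

fpr, tpr = roc_points(y_true, y_score)
```

With hard labels there is only one "threshold" available, the labelling itself, so the sweep collapses to a single (FPR, TPR) point rather than a curve.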

If you only have the predicted labels, I suggest you measure the accuracy, true positive rate, false positive rate, etc., all of which can be derived from the confusion matrix:

from sklearn.metrics import confusion_matrix

confusion_matrix(y_true=true_label, y_pred=predictions)
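As a rough sketch of how the per-class true positive rate and false positive rate could be read off such a matrix, the example below treats each class one-vs-rest. The label arrays are made-up stand-ins (the question's full lists are elided), and the matrix is built with NumPy in the same layout sklearn's confusion_matrix uses: rows are true classes, columns are predicted classes.

```python
import numpy as np

# Hypothetical labels for illustration only; the question's lists are truncated.
true_label = np.array([3, 4, 2, 1, 0, 2, 3, 1, 0, 4])
predictions = np.array([3, 4, 2, 2, 0, 2, 3, 1, 0, 3])
n_classes = 5

# Confusion matrix: rows = true class, columns = predicted class.
cm = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(cm, (true_label, predictions), 1)

# One-vs-rest counts for each class.
tp = np.diag(cm)               # correctly predicted as the class
fn = cm.sum(axis=1) - tp       # members of the class predicted as something else
fp = cm.sum(axis=0) - tp       # non-members predicted as the class
tn = cm.sum() - tp - fn - fp   # everything else

# Assumes every class appears at least once, otherwise a division by zero occurs.
tpr = tp / (tp + fn)  # sensitivity per class
fpr = fp / (fp + tn)  # 1 - specificity per class
```

Each (fpr[k], tpr[k]) pair is a single point in ROC space for class k, which is consistent with the comment above that the human readers in the linked figures appear as individual points rather than curves.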
Arturo Sbr