
I am aware of the concept of Precision as well as the concept of Recall. But I am finding it very hard to understand the idea of a 'threshold' which makes any P-R curve possible.

Imagine I have a model to build that predicts the recurrence (yes or no) of cancer in patients using some decent classification algorithm on relevant features. I split my data for training and testing. Let's say I trained the model using the train data and got my Precision and Recall metrics using the test data.

But HOW can I draw a P-R curve now? On what basis? I just have two values, one precision and one recall. I read that it's the 'threshold' that allows you to get several precision-recall pairs. But what is that threshold? I am still a beginner and I am unable to comprehend the very concept of the threshold.

I see curves like the one below in so many classification model comparisons. But how do they get that many pairs?

[Figure: Model Comparison Using Precision-Recall Curve]


3 Answers


ROC Curves:

  • x-axis: False Positive Rate FPR = FP /(FP + TN) = FP / N
  • y-axis: True Positive Rate TPR = Recall = TP /(TP + FN) = TP / P

Precision-Recall Curves:

  • x-axis: Recall = TP / (TP + FN) = TP / P = TPR
  • y-axis: Precision = TP / (TP + FP) = TP / PP (PP = all predicted positives)

Your cancer detection example is a binary classification problem. Your predictions are based on a probability: the probability of (not) having cancer.

In general, an instance would be classified as A if P(A) > 0.5 (your threshold value). For this value, you get your Recall-Precision pair based on the True Positives, True Negatives, False Positives and False Negatives.

Now, as you change that 0.5 threshold, you get a different result (a different pair). You could already classify a patient as 'has cancer' for P(A) > 0.3. This will decrease Precision and increase Recall. You would rather tell someone that he has cancer even though he does not, to make sure that patients who do have cancer get the treatment they need. This is the intuitive trade-off between TPR and FPR, or Precision and Recall, or Sensitivity and Specificity.
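A minimal sketch of this, with made-up labels and probabilities (not your actual model), just to show how two thresholds applied to the same predicted probabilities give two different Precision-Recall pairs:

import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical true labels and predicted probabilities of 'has cancer'
y_true = np.array([1, 1, 1, 0, 1, 0, 0, 1, 0, 0])
y_prob = np.array([0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1])

for threshold in (0.5, 0.3):
    y_pred = (y_prob >= threshold).astype(int)   # apply the threshold
    p = precision_score(y_true, y_pred)
    r = recall_score(y_true, y_pred)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")

Lowering the threshold from 0.5 to 0.3 here raises Recall from 0.80 to 1.00 while Precision drops from about 0.67 to 0.62: one point on the curve per threshold.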

Let's add these terms, since you will see them more often in biostatistics; a quick numeric check follows the list below.

  • Sensitivity = TP / P = Recall = TPR
  • Specificity = TN / N = (1 – FPR)
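A quick numeric check of these formulas, with hypothetical confusion-matrix counts (the numbers are made up, only the relationships matter):

# Hypothetical confusion-matrix counts
TP, FN = 40, 10    # P = TP + FN = 50 actual positives
TN, FP = 80, 20    # N = TN + FP = 100 actual negatives

sensitivity = TP / (TP + FN)   # Recall = TPR = 40/50 = 0.8
specificity = TN / (TN + FP)   # 80/100 = 0.8
fpr = FP / (FP + TN)           # 20/100 = 0.2, so 1 - FPR = specificity
print(sensitivity, specificity, 1 - fpr)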

ROC-curves and Precision-Recall curves visualize all these possible thresholds of your classifier.

You should consider these metrics if accuracy alone is not a suitable quality measure. Classifying all patients as 'does not have cancer' may give you the highest accuracy when cancer is rare, but the ROC and Precision-Recall curves will expose that immediately: Recall (TPR) is 0.
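If you want to see that threshold sweep in code: scikit-learn's roc_curve and precision_recall_curve take the predicted probabilities and evaluate every distinct score as a threshold for you. A small self-contained sketch with made-up data:

import numpy as np
from sklearn.metrics import roc_curve, precision_recall_curve

# Hypothetical true labels and predicted probabilities of the positive class
y_true = np.array([1, 1, 1, 0, 1, 0, 0, 1, 0, 0])
y_scores = np.array([0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1])

fpr, tpr, roc_thresholds = roc_curve(y_true, y_scores)
precision, recall, pr_thresholds = precision_recall_curve(y_true, y_scores)

# Each threshold contributes one point; plotting the (recall, precision) or
# (fpr, tpr) pairs gives the full curve instead of a single point.
print(len(pr_thresholds), "thresholds ->", len(precision), "precision/recall values")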

  • +1 for the clear explanation. However, I have a few questions: if I classify a patient as 'has cancer' for P(A) > 0.3, I am actually going to end up labeling many patients as 'Positive' for cancer, right? That means the False Positives will be high, leading to low precision. Am I missing something here? – Mr.A Sep 14 '17 at 18:35
  • Okay, before that, I assumed that when you go left to right in a precision-recall curve, your threshold increases. Is that a valid assumption? – Mr.A Sep 14 '17 at 19:00
  • Yes, you are right, my mistake, I mixed that up. FP goes up -> Precision goes down. The 2nd comment is also correct. :) – lnathan Sep 14 '17 at 20:02
  • You were right in your post. Lower the threshold - higher the Precision. It's a paradox. When the threshold is low, we end up labeling many patients as Positive, which will of course increase the number of False Positives, but it will also increase the number of True Positives, and especially when we have class imbalance (more Positives in the dataset than Negatives), we end up getting most of the predictions right by sheer chance. Conclusion - FP increases, but the increase in TP dominates FP, so Precision increases when a lower threshold is chosen. Correct me if I am wrong. – Mr.A Sep 17 '17 at 18:59
  • No, Recall will be high. Precision will be low, as you noticed in your first comment. – lnathan Sep 17 '17 at 19:48
  • @lnathan what about the fact that a given precision can be achieved at various thresholds, and the recall at each could be different? How are these points plotted? – figs_and_nuts Sep 09 '18 at 23:16
  • @MiloMinderbinder https://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-ranked-retrieval-results-1.html – lnathan Sep 10 '18 at 08:03
  • You would select the model with the highest Recall (with the same Precision). – lnathan Sep 10 '18 at 08:04
  • What would be the range of thresholds that we can choose from? Is it [0, 1]? – Yanfeng Liu Oct 24 '18 at 20:24
  • The number of thresholds is just the number of instances in your test set (the confidence of the predictions). – lnathan Oct 25 '18 at 10:07

In addition to plotting, you can get an optimal threshold from the curve using the function below:

from sklearn.metrics import precision_recall_curve
import numpy as np

def optimal_threshold_precision_recall_curve(gt, pmap):
    """Return the threshold where precision and recall are closest,
    together with the binary mask obtained by applying it to pmap."""
    # precision_recall_curve expects 1D arrays, so flatten the inputs
    gt_flat = gt.flatten()
    pmap_flat = pmap.flatten()
    precision, recall, thresholds = precision_recall_curve(gt_flat, pmap_flat, pos_label=1)
    # Pick the threshold with the smallest |precision - recall| gap
    optimal_threshold = sorted(zip(np.abs(precision - recall), thresholds),
                               key=lambda pair: pair[0])[0][1]
    optimal_mask = np.where(pmap > optimal_threshold, 1, 0)
    return optimal_threshold, optimal_mask

Note: The function takes the ground truth (gt) results together with the predicted probability map (pmap). The inputs are flattened inside the function, since precision_recall_curve accepts only 1D arrays. More details on the function and explanation can be found in this link.
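A small usage sketch with tiny made-up arrays (in practice gt and pmap would come from your own data and model):

import numpy as np

gt = np.array([[1, 0], [1, 0]])              # hypothetical ground-truth mask
pmap = np.array([[0.8, 0.4], [0.6, 0.2]])    # hypothetical probability map

best_threshold, mask = optimal_threshold_precision_recall_curve(gt, pmap)
print("optimal threshold:", best_threshold)
print(mask)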


Adding to the answers: if you are plotting the curve, you can first get the predicted probabilities by using predict_proba(X)

e.g.

>>> from sklearn.ensemble import RandomForestClassifier
>>> clf = RandomForestClassifier(max_depth=2, random_state=0)
>>> clf.fit(X_train, y_train)                   # fit on your training split first
>>> y_pred_proba = clf.predict_proba(X_test)    # probabilities for each class

scikit-learn.org's reference on predict_proba()

then proceed to plot the curve with

from sklearn.metrics import precision_recall_curve
import matplotlib.pyplot as plt

precision, recall, thresholds = precision_recall_curve(y_test, y_pred_proba[:,1])

# precision and recall have one more element than thresholds, hence the [:-1]
plt.plot(thresholds, precision[:-1], label='Precision')
plt.plot(thresholds, recall[:-1], label='Recall')
plt.xlabel('Threshold')
plt.ylabel('Score')
plt.legend()
plt.title('Precision-Recall vs. Threshold Curve')
plt.grid(True)
plt.show()
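If you also want the actual Precision-Recall curve (Precision against Recall rather than against the threshold), you can reuse the same precision and recall arrays from above:

plt.plot(recall, precision)   # one point per threshold
plt.xlabel('Recall')
plt.ylabel('Precision')
plt.title('Precision-Recall Curve')
plt.grid(True)
plt.show()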