
I pass `y_true` and `y_pred`, both of shape (999,) with labels 0 and 1, to `FP, TP, threshold = roc_curve(y_true, y_pred, pos_label=1)`, and the returned arrays contain only 3 elements. What can be wrong?

The whole code snippet:

```python
from sklearn.metrics import roc_curve, roc_auc_score

def get_roc_auc_scores(y_pred_labels, y_true_labels):
    roc_auc_scores = {key: {'FP': [], 'TP': [], 'Scores': []} for key in ['Low', 'High']}
    for key in roc_auc_scores.keys():
        for y_pred in y_pred_labels:
            # Get false positive and true positive rates
            fp, tp, thresh = roc_curve(y_true_labels[key], y_pred, pos_label=1)
            roc_auc_scores[key]['FP'].append(fp)
            roc_auc_scores[key]['TP'].append(tp)

            # Get AUC score
            auc_score = roc_auc_score(y_true_labels[key], y_pred)
            roc_auc_scores[key]['Scores'].append(auc_score)
    return roc_auc_scores
```

where `y_pred_labels` is a list containing 6 arrays of predictions and `y_true_labels` contains the true labels. 'High' and 'Low' just describe the specific case.
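The behavior is easy to reproduce in isolation. The sketch below (with synthetic data, not the asker's arrays) shows that `roc_curve` returns one point per distinct score value plus one, so hard 0/1 predictions can never yield more than 3 points:

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=999)  # binary ground truth

# Hard 0/1 predictions: only two distinct "score" values,
# so roc_curve can produce at most 3 points.
y_hard = rng.integers(0, 2, size=999)
fpr_hard, tpr_hard, thresh_hard = roc_curve(y_true, y_hard, pos_label=1)
print(len(fpr_hard))  # 3

# Continuous scores: many distinct thresholds, many points.
y_proba = rng.random(999)
fpr, tpr, thresh = roc_curve(y_true, y_proba, pos_label=1, drop_intermediate=False)
print(len(fpr))  # one point per distinct score, plus one
```

This is why `drop_intermediate=False` does not help here: there are no intermediate thresholds to keep in the first place.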

Beso
Filip Szczybura
  • What happens if you explicitly set `drop_intermediate=False` (default is True) in the call to `roc_curve()`? See [here](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html) for reference. – amiola Dec 19 '20 at 21:27
  • Unfortunately it does not help. Even such basic code: `y_true = [1,1,1,1,1]; y_pred = [0,1,1,0,0]; fp, tp, _ = roc_curve(y_true, y_pred, pos_label=1, drop_intermediate=False)` produces an output array of size (3,). Maybe my assumption that input and output sizes are the same is wrong. – Filip Szczybura Dec 19 '20 at 23:25
  • I suggest you to check out [this thread](https://stackoverflow.com/questions/31129592/how-to-get-a-classifiers-confidence-score-for-a-prediction-in-sklearn). It helped me a lot. – andexte May 06 '22 at 08:39

1 Answer


ROC curves are built by varying a decision threshold: you should not pass hard class predictions as `y_score`, but rather probability scores or some other confidence measure.

From the documentation:

`y_score` : array, shape = [n_samples]

Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by `decision_function` on some classifiers).

When you pass hard binary predictions, there are only three relevant thresholds: one below 0, one between 0 and 1, and one greater than 1 — hence the 3-element output.
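Concretely, the fix is to feed `roc_curve` the positive-class column of `predict_proba` (or the output of `decision_function`) instead of `predict`. A minimal sketch, assuming a scikit-learn classifier on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score

X, y = make_classification(n_samples=999, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Hard class predictions -> only 3 ROC points
fpr_hard, tpr_hard, _ = roc_curve(y, clf.predict(X), pos_label=1)
print(len(fpr_hard))  # 3

# Probability of the positive class -> a full curve
scores = clf.predict_proba(X)[:, 1]
fpr, tpr, thresh = roc_curve(y, scores, pos_label=1)
print(len(fpr))  # many points, one per candidate threshold

# roc_auc_score should likewise be given the scores, not the labels
print(roc_auc_score(y, scores))
```

The same applies to the `roc_auc_score` call in the question's snippet: with hard labels it computes the AUC of a degenerate 3-point curve rather than of the classifier's actual ranking.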

Ben Reiniger