
I have a classifier that outputs a proportion X between 0 and 1. I also have the associated ground truth, which is the real proportion. I want to predict 1 when the classifier's output is greater than some threshold and 0 otherwise. From visualizing the data I know that a good threshold is around 0.5.

How can I estimate the best threshold from the data?

Here is an example of my data

predicted = [0.13675214, 0.31400966, 0.28037383, 0.18337408, 0.10043668, 0.6,
             0.74242424, 0.30853994, 0.30588235, 0.24766355, 0.19806763, 0.20512821,
             0.29752066, 0.23504274, 0.14133333, 0.52733119, 0.46039604, 0.56306306,
             0.29059829, 0.02890173, 0.2962963, 0.47008547, 0.54545455, 0.58119658,
             0.3, 0.66242038, 0.42066421]

ground_truth = [0.11111111, 0.647343, 0.21028037, 0.20293399, 0.0, 0.93333333,
                1.0, 0.07162534, 0.61176471, 0.21028037, 0.647343, 0.11111111,
                0.07162534, 0.5, 0.08, 0.88424437, 0.58415842, 0.74774775,
                0.11111111, 0.03468208, 0.0, 0.5, 0.0, 0.91168091,
                1.0, 0.96178344, 0.10701107]

desired_output = [0,1,0,0,0,1,1,0,1,0,1,0,0,0,0,1,1,1,0,0,0,1,0,1,1,1,0]

Thank you

Lord exec
  • [This](https://stackoverflow.com/questions/28719067/roc-curve-and-cut-off-point-python) or [this](https://stats.stackexchange.com/questions/29719/how-to-determine-best-cutoff-point-and-its-confidence-interval-using-roc-curve-i) discussion would be worth looking at. – null Nov 08 '20 at 21:28
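(Editor's note: those linked discussions boil down to choosing the cut-off from the ROC curve, typically the point maximizing Youden's J statistic, TPR minus FPR. A minimal sketch of that recipe; the toy y_true/scores arrays below are illustrative placeholders, not data from the question:)

import numpy as np
from sklearn.metrics import roc_curve

# Illustrative toy data, not from the question
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1])
scores = np.array([0.1, 0.3, 0.35, 0.4, 0.55, 0.6, 0.65, 0.8])

# Youden's J = TPR - FPR; the ROC threshold where it peaks is the cut-off
fpr, tpr, thresholds = roc_curve(y_true, scores)
best = thresholds[np.argmax(tpr - fpr)]
print(f"ROC-based cut-off: {best}")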

2 Answers


Precision, recall, and F1-score all depend on the probability threshold: moving the cut-off that decides whether a sample belongs to the positive class changes precision and recall, and therefore the F1-score. Below is my attempt to plot precision, recall, and F1-score as a function of the discrimination threshold. The plot also marks the optimal threshold for the given dataset and model; by default, the optimal threshold is the one at which the F1-score is highest.

import numpy as np
import pandas as pd
import pathlib
import matplotlib.pyplot as plt
from matplotlib.ticker import (MultipleLocator, AutoMinorLocator)
from sklearn.metrics import confusion_matrix as cm_sklearn
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import f1_score

def plot_discrimination_threshold(clf, X_test, y_test, argmax='f1', title='Metrics vs Discriminant Threshold', fig_size=(10, 8), dpi=100, save_fig_path=None):
    """
    Plot precision, recall and f1-score vs discriminant threshold for the given pipeline model
    Parameters
    ----------
    clf : estimator instance (either sklearn.Pipeline, imblearn.Pipeline or a classifier)
        PRE-FITTED classifier or a PRE-FITTED Pipeline in which the last estimator is a classifier.
    X_test : pandas.DataFrame of shape (n_samples, n_features)
        Test features.
    y_test : pandas.Series of shape (n_samples,)
        Target values.
    argmax : str, default: 'f1'
        Annotate the threshold maximized by the supplied metric. Options: 'f1', 'precision', 'recall'
    title : str, default ='Metrics vs Discriminant Threshold'
        Plot title.
    fig_size : tuple, default = (10, 8)
        Size (inches) of the plot.
    dpi : int, default = 100
        Image DPI.
    save_fig_path : str, default=None
        Full path where to save the plot. Will generate the folders if they don't exist already.
    Returns
    -------
    fig : matplotlib.figure.Figure
        Figure from matplotlib.
    ax : matplotlib.axes.Axes
        Axes object from matplotlib.
    disc_threshold : float
        Threshold (rounded to 2 decimals) at which the `argmax` metric peaks.
    """

    thresholds = np.linspace(0, 1, 100)
    
    precision_ls = []
    recall_ls = []
    f1_ls = []
    fpr_ls = []
    fnr_ls = []
    
    # obtain probabilities
    probs = clf.predict_proba(X_test)[:,1]

    for threshold in thresholds:   
    
        # obtain class prediction based on threshold
        y_predictions = np.where(probs>=threshold, 1, 0) 
        
        # obtain confusion matrix
        tn, fp, fn, tp = cm_sklearn(y_test, y_predictions).ravel()
        
        # obtain FPR and FNR
        FPR = fp / (tn + fp)
        FNR = fn / (tp + fn)
        
        # obtain precision, recall and f1 scores
        # (zero_division=0 avoids undefined-metric warnings at extreme
        # thresholds where no samples are predicted positive)
        precision = precision_score(y_test, y_predictions, average='binary', zero_division=0)
        recall = recall_score(y_test, y_predictions, average='binary', zero_division=0)
        f1 = f1_score(y_test, y_predictions, average='binary', zero_division=0)
         
        precision_ls.append(precision)
        recall_ls.append(recall)
        f1_ls.append(f1)
        fpr_ls.append(FPR)
        fnr_ls.append(FNR)
              
    metrics = pd.concat([
        pd.Series(precision_ls),
        pd.Series(recall_ls),
        pd.Series(f1_ls),
        pd.Series(fpr_ls),
        pd.Series(fnr_ls)], axis=1)

    metrics.columns = ['precision', 'recall', 'f1', 'fpr', 'fnr']
    metrics.index = thresholds
    
    plt.rcParams["figure.facecolor"] = 'white'
    plt.rcParams["axes.facecolor"] = 'white'
    plt.rcParams["savefig.facecolor"] = 'white'
                
    fig, ax = plt.subplots(1, 1, figsize=fig_size, dpi=dpi)
    ax.plot(metrics['precision'], label='Precision')
    ax.plot(metrics['recall'], label='Recall')
    ax.plot(metrics['f1'], label='f1')
    ax.plot(metrics['fpr'], label='False Positive Rate (FPR)', linestyle='dotted')
    ax.plot(metrics['fnr'], label='False Negative Rate (FNR)', linestyle='dotted')
    
    # Draw a vertical line at the threshold that maximizes the chosen metric
    best_threshold = metrics[argmax].idxmax()
    disc_threshold = round(best_threshold, 2)
    ax.axvline(x=best_threshold, color='black', linestyle='dashed', label="$t_r$="+str(disc_threshold))

    ax.xaxis.set_major_locator(MultipleLocator(0.1))
    ax.xaxis.set_major_formatter('{x:.1f}')
    
    ax.yaxis.set_major_locator(MultipleLocator(0.1))
    ax.yaxis.set_major_formatter('{x:.1f}')

    ax.xaxis.set_minor_locator(MultipleLocator(0.05))    
    ax.yaxis.set_minor_locator(MultipleLocator(0.05))    

    ax.tick_params(which='both', width=2)
    ax.tick_params(which='major', length=7)
    ax.tick_params(which='minor', length=4, color='black') 
    
    plt.grid(True)
    
    plt.xlabel('Probability Threshold', fontsize=18)
    plt.ylabel('Scores', fontsize=18)
    plt.title(title, fontsize=18)
    leg = ax.legend(loc='best', frameon=True, framealpha=0.7)
    leg_frame = leg.get_frame()
    leg_frame.set_color('gold')
    plt.show()

    if save_fig_path is not None:
        path = pathlib.Path(save_fig_path)
        path.parent.mkdir(parents=True, exist_ok=True)
        fig.savefig(save_fig_path, dpi=dpi)

    return fig, ax, disc_threshold
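For completeness, a hypothetical usage sketch; the synthetic dataset and LogisticRegression model below are stand-ins, not part of the original answer. Any fitted binary classifier exposing predict_proba would work:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-in data and model for illustration only
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

fig, ax, best_threshold = plot_discrimination_threshold(clf, X_test, y_test, argmax='f1')
print(f"Threshold maximizing F1: {best_threshold}")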

[Plot: precision, recall, F1, FPR and FNR vs. probability threshold, with the F1-maximizing threshold marked]

Ozer Ozdal

Your predictions already seem to agree with the ground truth about 90% of the time:

import numpy as np

delta = np.asarray(predicted) - np.asarray(ground_truth)   # all but 3 of the 27 differences fall within 0.4

More examples of the model's predicted values would help illustrate the typical ranges.
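(Editor's note: to actually estimate the cut-off from the numbers in the question, one simple option, not part of this answer, is a brute-force sweep over the observed scores, keeping the one with the highest accuracy against desired_output:)

import numpy as np

# Data copied from the question
predicted = np.array([0.13675214, 0.31400966, 0.28037383, 0.18337408, 0.10043668,
                      0.6, 0.74242424, 0.30853994, 0.30588235, 0.24766355,
                      0.19806763, 0.20512821, 0.29752066, 0.23504274, 0.14133333,
                      0.52733119, 0.46039604, 0.56306306, 0.29059829, 0.02890173,
                      0.2962963, 0.47008547, 0.54545455, 0.58119658, 0.3,
                      0.66242038, 0.42066421])
desired_output = np.array([0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1,
                           1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0])

# Every distinct labeling is produced by a cut-off at one of the observed
# scores, so sweeping the unique scores covers all candidates
candidates = np.unique(predicted)
accuracies = [((predicted >= t).astype(int) == desired_output).mean() for t in candidates]
best = candidates[int(np.argmax(accuracies))]
print(f"best threshold = {best:.3f}, accuracy = {max(accuracies):.2%}")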