I have three binary classification models, and I have gotten as far as the code below while trying to combine them into a single comparative ROC plot.
import pandas as pd
import numpy as np
import sklearn.metrics as metrics
y_test = ... # a numpy array containing the test values
dfo = ... # a pd.DataFrame containing the model predictions
dfroc = dfo[['SVM', 'RF', 'NN']].apply(
    lambda y_pred: metrics.roc_curve(y_test[:-1], y_pred[:-1])[0:2],
    axis=0, result_type='reduce')
print(dfroc)
dfroc_auc = dfroc.apply(lambda x: metrics.auc(x[0], x[1]))
print(dfroc_auc)
This outputs the following (where dfroc and dfroc_auc are both of type pandas.core.series.Series):
SVM ([0.0, 0.016666666666666666, 1.0], [0.0, 0.923...
RF ([0.0, 0.058333333333333334, 1.0], [0.0, 0.769...
NN ([0.0, 0.06666666666666667, 1.0], [0.0, 1.0, 1...
dtype: object
SVM 0.953205
RF 0.855449
NN 0.966667
dtype: float64
To plot them as a comparative ROC, I need to convert dfroc into a pd.DataFrame with the following long (one row per point) structure. How can this reshaping be done?
   model  fpr      tpr
1  SVM    0.0      0.0
2  SVM    0.01666  0.923
3  SVM    1.0      ...
4  RF     0.0      0.0
5  RF     0.05833  0.769
6  RF     1.0      ...
7  NN     ...      ...
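For reference, one possible way to get this shape is sketched below. It assumes each element of dfroc is an (fpr, tpr) pair, as in the printed output above. The stand-in dfroc uses made-up numbers, and the name dfroc_long is hypothetical, not something from my actual code.

```python
import pandas as pd

# Stand-in for the dfroc Series: one (fpr, tpr) pair per model.
# These numbers are illustrative only, not real results.
dfroc = pd.Series({
    'SVM': ([0.0, 0.02, 1.0], [0.0, 0.9, 1.0]),
    'RF': ([0.0, 0.06, 1.0], [0.0, 0.8, 1.0]),
    'NN': ([0.0, 0.07, 1.0], [0.0, 1.0, 1.0]),
})

# Build one small frame per model (the scalar model name is broadcast
# over the rows), then stack the frames vertically.
dfroc_long = pd.concat(
    [pd.DataFrame({'model': model, 'fpr': fpr, 'tpr': tpr})
     for model, (fpr, tpr) in dfroc.items()],
    ignore_index=True,
)
print(dfroc_long)
```

Note that ignore_index=True produces a 0-based index, whereas the table above is numbered from 1.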
Then, following the directions from "How to plot ROC curve in Python", the plotting would be something like:
import matplotlib.pyplot as plt
plt.title('Receiver Operating Characteristic')
dfroc.plot(label = 'AUC = %0.2f' % roc_auc)
plt.legend(loc = 'lower right')
plt.plot([0, 1], [0, 1],'r--')
plt.xlim([0, 1])
plt.ylim([0, 1])
plt.ylabel('True Positive Rate')
plt.xlabel('False Positive Rate')
plt.show()
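To make the intent concrete, here is a rough sketch of the plot I am after, assuming a long-format frame with model, fpr and tpr columns already exists. The name dfroc_long and the fpr/tpr points are hypothetical stand-ins; the AUC values are the ones printed above.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Stand-ins for the objects in the question. The fpr/tpr points are
# illustrative only; the AUC values are copied from the printed output.
dfroc_long = pd.DataFrame({
    'model': ['SVM'] * 3 + ['RF'] * 3 + ['NN'] * 3,
    'fpr': [0.0, 0.02, 1.0, 0.0, 0.06, 1.0, 0.0, 0.07, 1.0],
    'tpr': [0.0, 0.9, 1.0, 0.0, 0.8, 1.0, 0.0, 1.0, 1.0],
})
dfroc_auc = pd.Series({'SVM': 0.953205, 'RF': 0.855449, 'NN': 0.966667})

fig, ax = plt.subplots()
# One curve per model; sort=False keeps the models in their original order.
for model, grp in dfroc_long.groupby('model', sort=False):
    ax.plot(grp['fpr'], grp['tpr'],
            label='%s (AUC = %0.2f)' % (model, dfroc_auc[model]))
ax.plot([0, 1], [0, 1], 'r--')  # chance diagonal, left out of the legend
ax.set_title('Receiver Operating Characteristic')
ax.set_xlim([0, 1])
ax.set_ylim([0, 1])
ax.set_xlabel('False Positive Rate')
ax.set_ylabel('True Positive Rate')
ax.legend(loc='lower right')
```

Calling plt.show() afterwards displays the figure, with all three curves on the same axes and one legend entry per model.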