How to plot the precision and recall curves of a CNN?

I have generated the scores from a CNN and want to plot the precision-recall curve, but I have not been able to. I have calculated TP, TN, FP, and FN using:

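idx = (ACTUAL == 1);                         % logical mask of the positive samples
p = sum(idx);                                % number of actual positives
n = sum(~idx);                               % number of actual negatives
N = p + n;
tp = sum(ACTUAL(idx) == PREDICTED(idx));     % true positives
tn = sum(ACTUAL(~idx) == PREDICTED(~idx));   % true negatives
fp = n - tn;                                 % false positives
fn = p - tp;                                 % false negatives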

The formulas for precision and recall are

precision = tp/(tp+fp)
recall = tp/(tp+fn)

but with these I get a plot that is not what I expect.

I have obtained scores of the CNN using the following command:

[YTest,score]=classify(convnet,TestData)
hbaderts
user123456789

2 Answers


MATLAB has a function for creating ROC curves and similar performance curves (such as precision-recall curves) in the Statistics and Machine Learning Toolbox: perfcurve. By default, the ROC curve is calculated. The function has the following syntax:

[X, Y] = perfcurve(labels, scores, posclass)

Here, labels is the true label for each sample, scores is the prediction of the CNN (or any other classifier), and posclass is the label of the class you assume to be "positive" - which appears to be 1 in your example. The outputs of the perfcurve function are the (x, y) coordinates of the ROC curve, so you can easily plot it using

plot(X, Y)

To make perfcurve plot the precision-recall curve instead of the ROC curve, you have to set the optional 'XCrit' and 'YCrit' arguments of the function. As described in the documentation, different pre-defined criteria such as number of false positives ('fp'), true positive rate ('tpr'), accuracy ('accu') and many more, or even custom functions can be used.

By setting 'XCrit' to 'tpr' (Recall) and 'YCrit' to 'prec' (Precision), a precision-recall curve is created:

[X, Y] = perfcurve(labels, scores, posclass, 'XCrit', 'tpr', 'YCrit', 'prec');
plot(X, Y);
xlabel('Recall')
ylabel('Precision')
xlim([0, 1])
ylim([0, 1])
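
As a rough illustration of why a single set of tp/fp/fn counts gives only one point rather than a curve: a precision-recall curve comes from recomputing those counts at many score thresholds. The sketch below does this by hand on made-up toy data (the labels and scores are hypothetical, not from the question), and it is not meant to reproduce how perfcurve is implemented internally.

```matlab
% Hypothetical toy data: 1 = positive, 0 = negative (not from the question).
labels = [1 1 0 1 0 0 1 0]';
scores = [0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2]';

% Use every distinct score as a decision threshold, highest first.
thresholds = sort(unique(scores), 'descend');
precision  = zeros(size(thresholds));
recall     = zeros(size(thresholds));

for i = 1:numel(thresholds)
    predicted = scores >= thresholds(i);    % classify at this threshold
    tp = sum(predicted & labels == 1);      % true positives
    fp = sum(predicted & labels == 0);      % false positives
    fn = sum(~predicted & labels == 1);     % false negatives
    precision(i) = tp / (tp + fp);
    recall(i)    = tp / (tp + fn);
end

plot(recall, precision, '-o');
xlabel('Recall'); ylabel('Precision');
xlim([0, 1]); ylim([0, 1]);
```

Each threshold contributes one (recall, precision) point, so the hard 0/1 predictions in PREDICTED alone are not enough; the continuous scores are needed.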

For example (using randomly generated data and an SVM):

[Image: sample precision-recall curve]

hbaderts
  • The predicted score of the CNN is 1000*100 in size (100 classes and 10 samples from each class), so when I try to use perfcurve it gives the error "You must pass scores as a vector of floating-point values." What I have done is take a column-wise sum to create a score vector. Is that right or wrong? If it is wrong, please suggest a suitable way. – user123456789 Apr 05 '18 at 09:43
  • You can only evaluate precision and recall for binary classification. I would usually recommend plotting one curve for each class. For that, you would pass `score(:,k)` to `perfcurve`. If you need to aggregate them into one plot, there are multiple ways to do so, as described [in this answer](https://stackoverflow.com/a/39420025/4221706). The simplest way would be to create the plots for each class as shown above and average them. – hbaderts Apr 05 '18 at 09:54
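
The per-class suggestion in the comment above could look roughly like the following sketch. It uses random stand-in data instead of real CNN output, and it assumes that column k of the score matrix belongs to class k; with the output of classify, that column order would need to be checked against the network's class list. The variable names trueClass, isClassK, rec and prec are made up for illustration.

```matlab
% Hypothetical stand-ins for the classifier output (not real CNN results).
rng(0);                                           % reproducible toy data
numClasses = 3;
trueClass  = repmat((1:numClasses)', 10, 1);      % 30 samples, true class index
score      = rand(numel(trueClass), numClasses);  % N-by-K score matrix

% Raise the score of the true class so the curves are not pure noise.
idx = sub2ind(size(score), (1:numel(trueClass))', trueClass);
score(idx) = score(idx) + 0.5;

figure; hold on;
for k = 1:numClasses
    isClassK = (trueClass == k);                  % one-vs-rest logical labels
    % Assumption: column k of score holds the scores for class k.
    [rec, prec] = perfcurve(isClassK, score(:, k), true, ...
                            'XCrit', 'tpr', 'YCrit', 'prec');
    plot(rec, prec, 'DisplayName', sprintf('class %d', k));
end
hold off;
xlabel('Recall'); ylabel('Precision');
xlim([0, 1]); ylim([0, 1]);
legend('show');
```

Each pass through the loop treats one class as positive and all others as negative, so perfcurve always receives a score vector with one value per sample.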

The answer by hbaderts is correct, but the last part of it is wrong. Use this call instead:

[X,Y] = perfcurve(labels,scores,posclass,'xCrit', 'fpr', 'yCrit', 'tpr');

Then the generated receiver operating characteristic (ROC) curve is correct.

PyMatFlow