
The output has four classes: [0, 1, 2, 3]. The prediction is a continuous number in [0, 1] (after applying a sigmoid function).

I have tried confusion_matrix and f1_score from sklearn, but both raise an error:

ValueError: Can't handle mix of multiclass and continuous
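
For reference, a minimal example that reproduces it (the arrays are made up, and the exact wording of the error varies between sklearn versions):

import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0, 1, 2, 3])           # integer class labels
y_pred = np.array([0.1, 0.4, 0.6, 0.9])   # continuous sigmoid outputs

confusion_matrix(y_true, y_pred)          # raises the ValueError above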

If I reduce the problem to a binary classifier and evaluate it with AUC, there is no error, which suggests that AUC can handle continuous inputs.

My question is: where can I find an evaluation metric in sklearn that not only deals with multiple classes but also handles continuous inputs?

Alex
  • Just to make it clear: you are trying to classify into one of 4 classes, and your prediction is a number in the range [0,1] for each class? Assuming you get this output, how do you choose the predicted class? – ginge Jan 22 '17 at 17:14
  • Good question. First, they are in [0,1] because the activation I chose is the sigmoid function. – Alex Jan 23 '17 at 03:30
  • Then I draw the output distribution and select three thresholds. – Alex Jan 23 '17 at 03:32
  • So you get a 1x4 vector of floats in [0,1], select 3 thresholds (how?), and then what? How do you use the thresholds? – ginge Jan 23 '17 at 13:02
  • I get a 1×n array (n = size of the dataset) of values in [0,1]. I plot their distribution: x is the value, y is how many items fall at that value. Then I pick the four peaks (central limit theorem) and place three thresholds between them (maybe midway between neighbouring peaks). I use those thresholds to predict. – Alex Jan 24 '17 at 16:43
  • Pseudocode: if (predict > thresholds[0] && predict <= thresholds[1]) predict = class0; else if (predict > thresholds[1] && predict <= thresholds[2]) predict = class1; ... – Alex Jan 24 '17 at 16:52
  • I am sorry to nitpick, but your final prediction is categorical (0, 1, 2, 3) and not continuous; your predictions rely *on* continuous values (specifically the thresholds). So what exactly are you trying to evaluate: the AUC of your predictions, or the AUC of your thresholds based on the previous operations? – ginge Jan 25 '17 at 08:37
  • Evaluate the AUC of the predictions. – Alex Jan 25 '17 at 16:27
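
In code, the thresholding scheme described in the comments above might look roughly like this (the threshold values are illustrative placeholders, not the ones actually read off the histogram):

import numpy as np

# Hypothetical thresholds separating the four peaks of the score histogram
thresholds = [0.25, 0.5, 0.75]

# 1-D array of sigmoid outputs in [0, 1], one per sample
scores = np.array([0.05, 0.3, 0.62, 0.9])

# np.digitize maps each score to a bin index 0..3, i.e. a predicted class
predicted_classes = np.digitize(scores, thresholds)
print(predicted_classes)  # [0 1 2 3]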

1 Answer


Before dealing with the particulars of your problem you need to make sure you understand the AUC metric and how to use it properly.

To understand what the AUC metric means, you can start here.

In essence you want to get predictions at many different thresholds (i.e. move the threshold around and re-predict each time), calculate the false positive rate and true positive rate at each threshold, and then compute the AUC over the resulting curve.
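
For the binary case sklearn does this sweep for you: roc_curve returns the FPR/TPR at every threshold, and auc integrates the resulting curve. A toy example with made-up labels and scores:

import numpy as np
from sklearn.metrics import roc_curve, auc

# Made-up binary labels and continuous scores (e.g. sigmoid outputs)
y_true = np.array([0, 0, 1, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7])

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # one point per threshold
print(auc(fpr, tpr))                               # area under that ROC curve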

Calculating and evaluating multi-class AUC is not straightforward. You can find more information here, but I attach below a good code snippet to get you started.
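
The snippet assumes you already have one ROC curve per class plus a micro-averaged one in the dictionaries fpr, tpr and roc_auc. A rough sketch of that setup is below; the data is randomly generated as a stand-in, and with your single sigmoid output you would first need one score column per class (e.g. from a one-vs-rest setup):

from itertools import cycle

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize

n_classes = 4
lw = 2  # line width used in the plots below

# Placeholder data so the snippet runs end-to-end; substitute your own labels/scores
rng = np.random.RandomState(0)
y_test = rng.randint(0, n_classes, size=200)   # true integer labels
y_score = rng.rand(200, n_classes)             # continuous per-class scores

# One-vs-rest: binarize the labels so each class becomes its own binary problem
y_test_bin = label_binarize(y_test, classes=list(range(n_classes)))

# Per-class ROC curves and AUCs computed from the continuous scores
fpr, tpr, roc_auc = dict(), dict(), dict()
for i in range(n_classes):
    fpr[i], tpr[i], _ = roc_curve(y_test_bin[:, i], y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])

# Micro-average: pool every (label, score) pair across all classes
fpr["micro"], tpr["micro"], _ = roc_curve(y_test_bin.ravel(), y_score.ravel())
roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])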

# Compute macro-average ROC curve and ROC area

# First aggregate all the unique false positive rates,
# assuming fpr and tpr are dicts with one ROC curve per class (as in the setup above)
all_fpr = np.unique(np.concatenate([fpr[i] for i in range(n_classes)]))

# Then interpolate all ROC curves at these points
mean_tpr = np.zeros_like(all_fpr)
for i in range(n_classes):
    mean_tpr += np.interp(all_fpr, fpr[i], tpr[i])

# Finally average it and compute AUC
mean_tpr /= n_classes

fpr["macro"] = all_fpr
tpr["macro"] = mean_tpr
roc_auc["macro"] = auc(fpr["macro"], tpr["macro"])

# Plot all ROC curves
plt.figure()
plt.plot(fpr["micro"], tpr["micro"],
     label='micro-average ROC curve (area = {0:0.2f})'
           ''.format(roc_auc["micro"]),
     color='deeppink', linestyle=':', linewidth=4)

plt.plot(fpr["macro"], tpr["macro"],
     label='macro-average ROC curve (area = {0:0.2f})'
           ''.format(roc_auc["macro"]),
     color='navy', linestyle=':', linewidth=4)

colors = cycle(['aqua', 'darkorange', 'cornflowerblue'])
for i, color in zip(range(n_classes), colors):
    plt.plot(fpr[i], tpr[i], color=color, lw=lw,
         label='ROC curve of class {0} (area = {1:0.2f})'
         ''.format(i, roc_auc[i]))

plt.plot([0, 1], [0, 1], 'k--', lw=lw)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Some extension of Receiver operating characteristic to multi-class')
plt.legend(loc="lower right")
plt.show()
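
As a rule of thumb, the macro-average gives every class equal weight (each per-class ROC curve contributes the same amount), while the micro-average pools all individual decisions and is therefore dominated by the more frequent classes; which of the two is more informative depends on how balanced your four classes are.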
ginge