
I am training a binary classifier using Keras.

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=[auroc,'accuracy'])

I use a custom metric, AUROC, defined as follows:

import tensorflow as tf
from sklearn.metrics import roc_auc_score

# Wrap sklearn's roc_auc_score so it can be called on each batch
# from inside the TF graph (tf.py_func, TF 1.x style)
def auroc(y_true, y_pred):
    return tf.py_func(roc_auc_score, (y_true, y_pred), tf.double)
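For reference, the wrapped function simply receives each batch's labels and scores as arrays; a small sketch with made-up values of what sklearn computes in both label layouts (a one-hot `(n, 2)` input is treated as multilabel and macro-averaged per column):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Made-up batch of 6 samples; every positive is scored above every negative.
y_true = np.array([0, 1, 1, 0, 1, 0])
y_score = np.array([0.2, 0.8, 0.7, 0.3, 0.9, 0.4])
print(roc_auc_score(y_true, y_score))          # 1.0

# One-hot layout, as produced by to_categorical with a Dense(2) head:
# roc_auc_score treats the (6, 2) arrays as multilabel input and
# macro-averages the per-column AUCs.
y_true_oh = np.eye(2)[y_true]
y_score_oh = np.stack([1 - y_score, y_score], axis=1)
print(roc_auc_score(y_true_oh, y_score_oh))    # 1.0
```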

So far, I have one-hot encoded my target and used a two-unit last layer:

from keras.utils import to_categorical
y = to_categorical(y)
[...]
model.add(Dense(2, activation='sigmoid'))

I have learned (see Keras binary_crossentropy vs categorical_crossentropy performance?) that in principle I should not one-hot encode the target, and should instead predict a single class using

# y = to_categorical(y)
[...]
model.add(Dense(1, activation='sigmoid'))
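Concretely, the only thing that changes on the data side is the label shape; a quick NumPy illustration (with `to_categorical` replaced by an equivalent `np.eye` lookup):

```python
import numpy as np

y = np.array([0, 1, 1, 0])

# What to_categorical(y) produces for the Dense(2) head: shape (4, 2)
y_onehot = np.eye(2)[y]
print(y_onehot.shape)   # (4, 2)

# What the Dense(1) head expects: the raw integer labels, shape (4,)
print(y.shape)          # (4,)
```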

However, if I apply this change and only this change, my training auroc drops dramatically, from the high 0.90s to ~0.50. Even more strangely, val_auroc seems unaffected.
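One detail worth keeping in mind when reading these logged numbers: a stateless custom metric like `auroc` above is computed per batch and averaged across batches, and the average of batch-wise AUROCs is generally not the dataset-level AUROC. A small made-up illustration:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Two made-up batches of 4 samples each.
y1, p1 = np.array([0, 1, 1, 0]), np.array([0.2, 0.9, 0.6, 0.4])
y2, p2 = np.array([0, 1, 0, 1]), np.array([0.5, 0.55, 0.45, 0.35])

# Keras-style logging: metric per batch, then averaged.
per_batch = np.mean([roc_auc_score(y1, p1), roc_auc_score(y2, p2)])

# AUROC computed once over the whole dataset.
overall = roc_auc_score(np.concatenate([y1, y2]), np.concatenate([p1, p2]))

print(per_batch, overall)   # 0.75 0.8125
```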

How did that happen?

00__00__00
  • The most important question is: What type of label are you trying to predict? If you only have 2 classes, positive and negative, then binary should be used rather than categorical (as in [this](https://gombru.github.io/2018/05/23/cross_entropy_loss/) very good post), because binary is a special case of categorical with its own formula that assumes only a pos/neg class – G. Anderson Oct 31 '18 at 16:50
  • It is indeed binary; that's why I removed the categorical encoding. This motivates the rest of the question: why did that cause so much change? Is it expected? – 00__00__00 Oct 31 '18 at 16:52
  • If you removed the categorical encoding, I'm assuming you also removed the one-hot encoding of your labels, as it shouldn't be necessary. Regarding your training AUROC score, remember that if the area under your ROC curve is ~0.5, your model is unable to do any better than pure random guessing. Given this, are you sure that your `val_auroc` loss is being recalculated and not just re-used from an earlier run? – G. Anderson Oct 31 '18 at 16:57

0 Answers