I've seen many similar questions on Stack Overflow, but none of them refers to my case.
I have a multiclass classification problem and my labels are mutually exclusive.
Training with binary_crossentropy, due to a typo, resulted in a lower loss and higher accuracy. What's interesting here is that, unlike the other questions on Stack Overflow, I am printing Keras's "categorical_accuracy" metric. My labels are one-hot encoded.
So, to be exact, my code looks like this:
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM, TimeDistributed

net = Sequential()
# model_A is applied independently to each of the `timesteps` frames
net.add(TimeDistributed(model_A, input_shape=(timesteps, 960, 75, 1)))
net.add(LSTM(100))
net.add(Dropout(0.5))
net.add(Dense(100, activation='relu'))
net.add(Dense(len(labels), activation='softmax'))
net.compile(loss='binary_crossentropy', optimizer=adam_opt,
            metrics=['binary_accuracy', 'categorical_accuracy'])
When I noticed the typo, I also trained with "categorical_crossentropy", and the results were worse. How can this be explained?
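To illustrate the gap I'm seeing, here is a minimal numpy sketch (synthetic numbers, not my actual data) of how the two losses differ on a single one-hot target. Keras's binary_crossentropy averages a per-class binary term over all output units, while categorical_crossentropy takes only the negative log-probability of the true class:

```python
import numpy as np

# Synthetic example: one one-hot label over 4 classes and a softmax output
y_true = np.array([0.0, 0.0, 1.0, 0.0])
y_pred = np.array([0.1, 0.2, 0.6, 0.1])

# categorical_crossentropy: -sum_c y_true[c] * log(y_pred[c])
cce = -np.sum(y_true * np.log(y_pred))

# binary_crossentropy: mean over classes of the per-class binary term,
# treating each output unit as an independent binary problem
bce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

print(cce)  # ~0.511
print(bce)  # ~0.236 -- smaller, since the "easy" zero classes pull the mean down
```

The binary loss value comes out smaller here because the mean includes the mostly confident zero entries, which may be part of why the reported loss looks lower even on the same model.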