
In machine learning classification problems, we typically use a single label per data point. How can we handle multiple labels for a single data point?

As an example, consider a character recognition problem. The labels for a single image of a letter are the encoded values of both the letter and the font family, so there are two labels per data point.

How can we build a Keras deep learning model for this? Which hyperparameters should change compared with a single-label problem?

Nicolas Gervais
Bumuthu Dilshan

2 Answers


In short, you let the model output two predictions.

        ...
previous-to-last layer
     /      \
label_1    label_2

Then you could compute total_loss = loss_1(label_1) + loss_2(label_2), with loss_1 and loss_2 of your choosing, and backpropagate total_loss through the network to fine-tune the weights.

More in-depth example: https://towardsdatascience.com/journey-to-the-center-of-multi-label-classification-384c40229bff.
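The branching diagram above can be sketched with the Keras functional API. This is a minimal sketch, not code from the answer; the input shape (28x28 grayscale letter images), the 26 letter classes, and the 5 font-family classes are all assumptions for illustration:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Shared trunk: input flows through a common previous-to-last layer
inputs = tf.keras.Input(shape=(28, 28, 1))
x = layers.Flatten()(inputs)
x = layers.Dense(64, activation='relu')(x)  # previous-to-last layer

# Two heads branching from the shared representation
letter = layers.Dense(26, activation='softmax', name='letter')(x)  # label_1
font = layers.Dense(5, activation='softmax', name='font')(x)       # label_2

model = Model(inputs=inputs, outputs=[letter, font])

# Keras sums the per-output losses into the total loss automatically,
# which is exactly total_loss = loss_1(label_1) + loss_2(label_2)
model.compile(optimizer='adam',
              loss={'letter': 'sparse_categorical_crossentropy',
                    'font': 'sparse_categorical_crossentropy'})
```

You would then call model.fit(images, {'letter': letter_ids, 'font': font_ids}) with one target array per head.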

Aventau

Compared with a standard multi-class task, you just need to change the final activation function to 'sigmoid' (and, as in the code below, the loss to binary cross-entropy):

import tensorflow as tf
from tensorflow.keras.layers import Dense
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# One-hot encode the three iris classes
y = tf.one_hot(y, depth=3).numpy()

# Force the first label on for every sample, so each data point
# now carries multiple labels (a multi-label setup)
y[:, 0] = 1.

ds = tf.data.Dataset.from_tensor_slices((X, y)).shuffle(25).batch(8)

model = tf.keras.Sequential([
    Dense(16, activation='relu'),
    Dense(32, activation='relu'),
    Dense(3, activation='sigmoid')])  # sigmoid: one independent probability per label

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

history = model.fit(ds, epochs=25)
Epoch 25/25
 1/19 [>.............................] - ETA: 0s - loss: 0.0418 - acc: 1.0000
19/19 [==============================] - 0s 2ms/step - loss: 1.3129 - acc: 1.0000
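Since each sigmoid unit outputs an independent probability, predictions are decoded per label with a threshold rather than a single argmax. A minimal sketch, with made-up example probabilities and the common 0.5 threshold:

```python
import numpy as np

# Hypothetical sigmoid outputs from model.predict(...) for two samples
probs = np.array([[0.95, 0.10, 0.70],
                  [0.99, 0.80, 0.05]])

# Each label is decided independently; a sample can have several active labels
labels = (probs >= 0.5).astype(int)
print(labels)  # [[1 0 1]
               #  [1 1 0]]
```

The 0.5 cutoff is a convention, not a requirement; it can be tuned per label on a validation set.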
Nicolas Gervais