
In machine learning classification problems, we typically use a single label per data point. How can we handle multiple labels for a single data point?

As an example, consider a character recognition problem. The labels for a single image of a letter are the encoded values of both the letter and the font family, so there are two labels per data point.

How can we build a Keras deep learning model for this? Which hyperparameters should change compared with a single-label problem?

Nicolas Gervais
Bumuthu Dilshan

2 Answers


In short, you let the model output two predictions.

        ...
previous-to-last layer
     /      \
label_1    label_2

Then you could compute total_loss = loss_1(label_1) + loss_2(label_2), with loss_1 and loss_2 of your choosing, and backpropagate total_loss through the network to fine-tune the weights.

More in-depth example: https://towardsdatascience.com/journey-to-the-center-of-multi-label-classification-384c40229bff.
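The branching diagram above can be sketched with the Keras functional API. This is a minimal sketch, not code from the answer; the input shape (28x28 grayscale letter images), the 26 letter classes, and the 5 font-family classes are all assumptions for illustration:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Shared trunk: input flows through a common previous-to-last layer
inputs = tf.keras.Input(shape=(28, 28, 1))
x = layers.Flatten()(inputs)
x = layers.Dense(64, activation='relu')(x)  # previous-to-last layer

# Two heads branching from the shared representation
letter = layers.Dense(26, activation='softmax', name='letter')(x)  # label_1
font = layers.Dense(5, activation='softmax', name='font')(x)       # label_2

model = Model(inputs=inputs, outputs=[letter, font])

# Keras sums the per-output losses into the total loss automatically,
# which is exactly total_loss = loss_1(label_1) + loss_2(label_2)
model.compile(optimizer='adam',
              loss={'letter': 'sparse_categorical_crossentropy',
                    'font': 'sparse_categorical_crossentropy'})
```

You would then call model.fit(images, {'letter': letter_ids, 'font': font_ids}) with one target array per head.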

Aventau

Compared with a standard multi-class task, you just need to change the final activation function to 'sigmoid' (and, as in the code below, the loss to binary cross-entropy):

import tensorflow as tf
from tensorflow.keras.layers import Dense
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# One-hot encode the three iris classes
y = tf.one_hot(y, depth=3).numpy()

# Force the first label on for every sample, so each data point
# now carries multiple labels (a multi-label setup)
y[:, 0] = 1.

ds = tf.data.Dataset.from_tensor_slices((X, y)).shuffle(25).batch(8)

model = tf.keras.Sequential([
    Dense(16, activation='relu'),
    Dense(32, activation='relu'),
    Dense(3, activation='sigmoid')])  # sigmoid: one independent probability per label

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

history = model.fit(ds, epochs=25)
Epoch 25/25
 1/19 [>.............................] - ETA: 0s - loss: 0.0418 - acc: 1.0000
19/19 [==============================] - 0s 2ms/step - loss: 1.3129 - acc: 1.0000
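Since each sigmoid unit outputs an independent probability, predictions are decoded per label with a threshold rather than a single argmax. A minimal sketch, with made-up example probabilities and the common 0.5 threshold:

```python
import numpy as np

# Hypothetical sigmoid outputs from model.predict(...) for two samples
probs = np.array([[0.95, 0.10, 0.70],
                  [0.99, 0.80, 0.05]])

# Each label is decided independently; a sample can have several active labels
labels = (probs >= 0.5).astype(int)
print(labels)  # [[1 0 1]
               #  [1 1 0]]
```

The 0.5 cutoff is a convention, not a requirement; it can be tuned per label on a validation set.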
Nicolas Gervais