
Suppose you have a Keras model with n neurons in the output layer, where each neuron is associated with a regression target (e.g. the speed of a car, the height of a car, ...), as in the following code snippet:

from keras.layers import Input, Dense
from keras.models import Model

# define Keras model
input_layer = Input(shape=shape)
x = ...  # e.g. conv layers
x = Dense(n, activation='linear')(x)
model = Model(inputs=input_layer, outputs=x)

model.compile(loss='mean_absolute_error', optimizer='sgd', metrics=['mean_squared_error'])

history = model.fit_generator(...)

Now, the MAE loss that is stored in the history dictionary is a single number, which is calculated from the n-dimensional y_pred and y_true arrays. In other words, the single loss value is the average of the individual losses over the n labels, as can be seen in the Keras MAE function:

def mean_absolute_error(y_true, y_pred):
    return K.mean(K.abs(y_pred - y_true), axis=-1)

However, I'd like to get a history object that contains the loss for each of the n labels, i.e. {loss: {'speed': loss_value_speed, 'height': loss_value_height}}. Ideally, the progress bar during training should also show the individual losses rather than the combined one.

How can I do that?

I suppose that one could write a custom metric for each output neuron, which calculates the loss for only a single index in the y_pred and y_true vectors, but that feels like a workaround:

def mean_absolute_error_label_0(y_true, y_pred):
    # calculate the loss only for the first label, label_0 (column 0 of each batch)
    return K.mean(K.abs(y_pred[:, 0] - y_true[:, 0]))
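
For illustration, registering one such per-label metric for each output could look like the following sketch (the make_label_mae factory and the metric names 'mae_speed'/'mae_height' are placeholders of mine, not an existing Keras API):

from keras import backend as K

def make_label_mae(index, name):
    # build a metric that computes the MAE for a single output column
    def metric(y_true, y_pred):
        return K.mean(K.abs(y_pred[:, index] - y_true[:, index]))
    metric.__name__ = name  # Keras derives the log / progress-bar name from __name__
    return metric

model.compile(loss='mean_absolute_error',
              optimizer='sgd',
              metrics=[make_label_mae(0, 'mae_speed'), make_label_mae(1, 'mae_height')])

This would still train on the combined MAE loss, but the per-label values should at least show up in the logs and in history.history.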
0vbb
  • You can return an array with your loss for every feature and then use callback functions as in my answer here: https://stackoverflow.com/questions/50061393/how-to-get-results-from-custom-loss-function-in-keras/50061570#50061570 – Mihai Alexandru-Ionut May 25 '18 at 09:29
  • Thank you! Unfortunately, I still have a problem: I've defined my custom metric as `return K.abs(y_pred - y_true)`, which returns a TF tensor of shape 5. However, if I access the metric through the logs dict, the metric is a scalar and not a vector anymore? – 0vbb May 25 '18 at 10:53
  • I think you forgot to pass the metric to the `metrics` argument of the compile method: `metrics=['mean_squared_error', mean_absolute_error]` – Mihai Alexandru-Ionut May 25 '18 at 11:02
  • Mhm, I don't think so. I did it as in this code snippet: https://pastebin.com/P1C80gi0 – 0vbb May 25 '18 at 11:13
  • You can't get an array from `logs`. The loss values inserted into `logs` are averaged scalar values. – Yu-Yang May 25 '18 at 18:25

1 Answer


A possible solution is to use a separate output layer for each target and to assign a name to each of them (i.e., Dense(1, name='...')). In your case it would be equivalent to training with a single Dense(n) output layer, since the total loss is just the sum of the individual losses.

For example,

import numpy as np
from keras.layers import Input, Dense
from keras.models import Model

input_layer = Input(shape=(1000,))
x = Dense(100)(input_layer)

# name each output layer
target_names = ('speed', 'height')
outputs = [Dense(1, name=name)(x) for name in target_names]

model = Model(inputs=input_layer, outputs=outputs)
model.compile(loss='mean_absolute_error', optimizer='sgd', metrics=['mean_squared_error'])

Now when you fit the model, you should be able to see the losses (and metrics) for each target separately.

X = np.random.rand(10000, 1000)
y = [np.random.rand(10000) for _ in range(len(outputs))]
history = model.fit(X, y, epochs=3)

Epoch 1/3
10000/10000 [==============================] - 1s 127us/step - loss: 0.9714 - speed_loss: 0.4768 - height_loss: 0.4945 - speed_mean_squared_error: 0.5253 - height_mean_squared_error: 0.5939
Epoch 2/3
10000/10000 [==============================] - 1s 101us/step - loss: 0.5109 - speed_loss: 0.2569 - height_loss: 0.2540 - speed_mean_squared_error: 0.0911 - height_mean_squared_error: 0.0895
Epoch 3/3
10000/10000 [==============================] - 1s 107us/step - loss: 0.5040 - speed_loss: 0.2529 - height_loss: 0.2511 - speed_mean_squared_error: 0.0873 - height_mean_squared_error: 0.0862

The losses saved to the returned history object will also be named.

print(history.history)

{'height_loss': [0.49454938204288484, 0.2539591451406479, 0.25108356306552887],
 'height_mean_squared_error': [0.5939331066846848,
  0.08951960142850876,
  0.08619525188207626],
 'loss': [0.9713814586639404, 0.5108571118354798, 0.5040025643348693],
 'speed_loss': [0.47683207807540895, 0.25689796624183653, 0.25291900217533114],
 'speed_mean_squared_error': [0.5252606071352959,
  0.09107607080936432,
  0.0872862442612648]}

EDIT: If the loss of output height depends on the value of speed, you can:

  • Concatenate the outputs, because you'll need both values to compute the custom loss
  • Name the Concatenate layer "height"; this will be the output for height in the history object
  • Provide two loss functions to model.compile() (one for speed and one for the concatenated output height)
from keras import losses
from keras.layers import Concatenate

def custom_loss(y_true, y_pred):
    y_pred_height = y_pred[:, 0]
    y_pred_speed = y_pred[:, 1]

    # some loss which depends on the value of `speed`
    loss = losses.mean_absolute_error(y_true, y_pred_height * y_pred_speed)
    return loss

input_layer = Input(shape=(1000,))
x = Dense(100, activation='relu')(input_layer)

output_speed = Dense(1, activation='relu', name='speed')(x)
output_height = Dense(1, activation='relu')(x)
output_merged = Concatenate(name='height')([output_height, output_speed])

model = Model(inputs=input_layer, outputs=[output_speed, output_merged])
model.compile(loss={'speed': 'mean_absolute_error', 'height': custom_loss},
              optimizer='sgd',
              metrics={'speed': 'mean_squared_error'})

The output will be:

X = np.random.rand(10000, 1000)
y = [np.random.rand(10000), np.random.rand(10000)]
history = model.fit(X, y, epochs=3)

Epoch 1/3
10000/10000 [==============================] - 5s 501us/step - loss: 1.0001 - speed_loss: 0.4976 - height_loss: 0.5026 - speed_mean_squared_error: 0.3315
Epoch 2/3
10000/10000 [==============================] - 2s 154us/step - loss: 0.9971 - speed_loss: 0.4960 - height_loss: 0.5011 - speed_mean_squared_error: 0.3285
Epoch 3/3
10000/10000 [==============================] - 1s 149us/step - loss: 0.9971 - speed_loss: 0.4960 - height_loss: 0.5011 - speed_mean_squared_error: 0.3285

print(history.history)
{'height_loss': [0.502568191242218, 0.5011419380187988, 0.5011419407844544],
 'loss': [1.0001451692581176, 0.9971360887527466, 0.9971360870361328],
 'speed_loss': [0.4975769768714905, 0.4959941484451294, 0.4959941472053528],
 'speed_mean_squared_error': [0.33153974375724793,
                              0.32848617186546325,
                              0.32848617215156556]}
Yu-Yang
  • Thank you for the great answer, it has helped me a lot! I have an additional question, though: suppose that the loss for the output 'height' is a custom one which needs the y_pred predictions from the 'speed' output. Is this possible? I can only imagine concatenating the outputs into a single output again, but then the history for the individual losses would be lost. – 0vbb Jun 18 '18 at 07:23
  • @0vbb Yes, it's possible. Please see my edit to check if it's what you want. – Yu-Yang Jun 18 '18 at 10:22
  • Unfortunately I ran into another issue with this method: Suppose that speed and height are connected to two separate dense networks, i.e. `speed` to `dense_nn_1` and `height` to `dense_nn_2`. Both networks are then connected to a main network `main_nn`. Now, I want to stop the gradient from `dense_nn_1` to the `main_nn` by putting `K.stop_gradient` in between the two networks. However, due to the `Concatenate` layer, the `speed` loss will still be propagated over the `dense_nn_2` from `height`. Is it possible to fix this? – 0vbb Jul 13 '18 at 10:11
  • I think using `y_pred_speed = K.stop_gradient(y_pred[:, 1])` in `custom_loss` can fix it (see the sketch after these comments). – Yu-Yang Jul 16 '18 at 01:53
  • @Yu-Yang Can you please take a look at this question? https://stackoverflow.com/q/58900947/5904928 I have been struggling for hours and couldn't find an answer. – Aaditya Ura Nov 17 '19 at 13:46
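
For reference, here is a minimal sketch of the stop-gradient variant suggested in the comments above, i.e. custom_loss rewritten so that no gradient flows back through the speed column of the concatenated output (this is my reading of the suggestion, not code from the original answer):

from keras import backend as K
from keras import losses

def custom_loss(y_true, y_pred):
    y_pred_height = y_pred[:, 0]
    # stop_gradient blocks backpropagation through the speed branch for this loss
    y_pred_speed = K.stop_gradient(y_pred[:, 1])

    # some loss which depends on the value of `speed`
    return losses.mean_absolute_error(y_true, y_pred_height * y_pred_speed)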