
Suppose you have a Keras model with n neurons in the output layer, where each neuron is associated with a regression target (e.g. the speed of a car, the height of a car, ...), as in the following code snippet:

from keras.layers import Input, Dense
from keras.models import Model

# define Keras model
input_layer = Input(shape=shape)
x = ...  # e.g. conv layers
x = Dense(n, activation='linear')(x)
model = Model(inputs=input_layer, outputs=x)

model.compile(loss='mean_absolute_error', optimizer='sgd', metrics=['mean_squared_error'])

history = model.fit_generator(...)

Now, the MAE loss that is stored in the history dictionary is a single number, which is calculated from the n-dimensional y_pred and y_true arrays. In other words, the single loss value is the average of the individual losses over the n labels, as can be seen in the Keras MAE function:

def mean_absolute_error(y_true, y_pred):
    return K.mean(K.abs(y_pred - y_true), axis=-1)

However, I'd like to get a history object that contains the loss for each of the n labels, i.e. {loss: {'speed': loss_value_speed, 'height': loss_value_height}}. Ideally, the progress bar during training should also show the individual losses rather than the combined one.

How can I do that?

I suppose that one could write a custom metric for each output neuron, which calculates the loss for only a single index in the y_pred and y_true vectors, but that feels like a workaround:

def mean_absolute_error_label_0(y_true, y_pred):
    # calculate the loss only for the first label, label_0 (column 0 of each batch)
    return K.mean(K.abs(y_pred[:, 0] - y_true[:, 0]))
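
For illustration, registering one such per-label metric for each output could look like the following sketch (the make_label_mae factory and the metric names 'mae_speed'/'mae_height' are placeholders of mine, not an existing Keras API):

from keras import backend as K

def make_label_mae(index, name):
    # build a metric that computes the MAE for a single output column
    def metric(y_true, y_pred):
        return K.mean(K.abs(y_pred[:, index] - y_true[:, index]))
    metric.__name__ = name  # Keras derives the log / progress-bar name from __name__
    return metric

model.compile(loss='mean_absolute_error',
              optimizer='sgd',
              metrics=[make_label_mae(0, 'mae_speed'), make_label_mae(1, 'mae_height')])

This would still train on the combined MAE loss, but the per-label values should at least show up in the logs and in history.history.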
0vbb
  • You can return an array with your loss for every feature and then use callback functions as in my answer here: https://stackoverflow.com/questions/50061393/how-to-get-results-from-custom-loss-function-in-keras/50061570#50061570 – Mihai Alexandru-Ionut May 25 '18 at 09:29
  • Thank you! Unfortunately, I still have a problem: I've defined my custom metric as `return K.abs(y_pred - y_true)`, which returns a TF tensor of shape 5. However, if I access the metric through the logs dict, the metric is a scalar and not a vector anymore? – 0vbb May 25 '18 at 10:53
  • I think you forgot to pass the metric to the `metrics` argument of the compile method: `metrics=['mean_squared_error', mean_absolute_error]` – Mihai Alexandru-Ionut May 25 '18 at 11:02
  • Mhm, I don't think so. I did it as in this code snippet: https://pastebin.com/P1C80gi0 – 0vbb May 25 '18 at 11:13
  • You can't get an array from `logs`. The loss values inserted into `logs` are averaged scalar values. – Yu-Yang May 25 '18 at 18:25

1 Answer


A possible solution is to use a separate output layer for each target and to assign a name to each of them (i.e., Dense(1, name='...')). In your case it would be equivalent to training with a single Dense(n) output layer, since the total loss is just the sum of the individual losses.

For example,

import numpy as np
from keras.layers import Input, Dense
from keras.models import Model

input_layer = Input(shape=(1000,))
x = Dense(100)(input_layer)

# name each output layer
target_names = ('speed', 'height')
outputs = [Dense(1, name=name)(x) for name in target_names]

model = Model(inputs=input_layer, outputs=outputs)
model.compile(loss='mean_absolute_error', optimizer='sgd', metrics=['mean_squared_error'])

Now when you fit the model, you should be able to see the losses (and metrics) for each target separately.

X = np.random.rand(10000, 1000)
y = [np.random.rand(10000) for _ in range(len(outputs))]
history = model.fit(X, y, epochs=3)

Epoch 1/3
10000/10000 [==============================] - 1s 127us/step - loss: 0.9714 - speed_loss: 0.4768 - height_loss: 0.4945 - speed_mean_squared_error: 0.5253 - height_mean_squared_error: 0.5939
Epoch 2/3
10000/10000 [==============================] - 1s 101us/step - loss: 0.5109 - speed_loss: 0.2569 - height_loss: 0.2540 - speed_mean_squared_error: 0.0911 - height_mean_squared_error: 0.0895
Epoch 3/3
10000/10000 [==============================] - 1s 107us/step - loss: 0.5040 - speed_loss: 0.2529 - height_loss: 0.2511 - speed_mean_squared_error: 0.0873 - height_mean_squared_error: 0.0862

The losses saved to the returned history object will also be named.

print(history.history)

{'height_loss': [0.49454938204288484, 0.2539591451406479, 0.25108356306552887],
 'height_mean_squared_error': [0.5939331066846848,
  0.08951960142850876,
  0.08619525188207626],
 'loss': [0.9713814586639404, 0.5108571118354798, 0.5040025643348693],
 'speed_loss': [0.47683207807540895, 0.25689796624183653, 0.25291900217533114],
 'speed_mean_squared_error': [0.5252606071352959,
  0.09107607080936432,
  0.0872862442612648]}

EDIT: If the loss of output height depends on the value of speed, you can:

  • Concatenate the outputs, because you'll need both values to compute the custom loss
  • Name the Concatenate layer "height"; this will be the output for height in the history object
  • Provide two loss functions to model.compile() (one for speed and one for the concatenated output height)
from keras import losses
from keras.layers import Concatenate

def custom_loss(y_true, y_pred):
    y_pred_height = y_pred[:, 0]
    y_pred_speed = y_pred[:, 1]

    # some loss which depends on the value of `speed`
    loss = losses.mean_absolute_error(y_true, y_pred_height * y_pred_speed)
    return loss

input_layer = Input(shape=(1000,))
x = Dense(100, activation='relu')(input_layer)

output_speed = Dense(1, activation='relu', name='speed')(x)
output_height = Dense(1, activation='relu')(x)
output_merged = Concatenate(name='height')([output_height, output_speed])

model = Model(inputs=input_layer, outputs=[output_speed, output_merged])
model.compile(loss={'speed': 'mean_absolute_error', 'height': custom_loss},
              optimizer='sgd',
              metrics={'speed': 'mean_squared_error'})

The output will be:

X = np.random.rand(10000, 1000)
y = [np.random.rand(10000), np.random.rand(10000)]
history = model.fit(X, y, epochs=3)

Epoch 1/3
10000/10000 [==============================] - 5s 501us/step - loss: 1.0001 - speed_loss: 0.4976 - height_loss: 0.5026 - speed_mean_squared_error: 0.3315
Epoch 2/3
10000/10000 [==============================] - 2s 154us/step - loss: 0.9971 - speed_loss: 0.4960 - height_loss: 0.5011 - speed_mean_squared_error: 0.3285
Epoch 3/3
10000/10000 [==============================] - 1s 149us/step - loss: 0.9971 - speed_loss: 0.4960 - height_loss: 0.5011 - speed_mean_squared_error: 0.3285

print(history.history)
{'height_loss': [0.502568191242218, 0.5011419380187988, 0.5011419407844544],
 'loss': [1.0001451692581176, 0.9971360887527466, 0.9971360870361328],
 'speed_loss': [0.4975769768714905, 0.4959941484451294, 0.4959941472053528],
 'speed_mean_squared_error': [0.33153974375724793,
                              0.32848617186546325,
                              0.32848617215156556]}
Yu-Yang
  • Thank you for the great answer, it has helped me a lot! I have an additional question, though: suppose that the loss for the output 'height' is a custom one which needs the y_pred predictions from the 'speed' output. Is this possible? I can only imagine concatenating the outputs into a single output again, but then the history for the individual losses would be lost. – 0vbb Jun 18 '18 at 07:23
  • @0vbb Yes, it's possible. Please see my edit to check if it's what you want. – Yu-Yang Jun 18 '18 at 10:22
  • Unfortunately I ran into another issue with this method: Suppose that speed and height are connected to two separate dense networks, i.e. `speed` to `dense_nn_1` and `height` to `dense_nn_2`. Both networks are then connected to a main network `main_nn`. Now, I want to stop the gradient from `dense_nn_1` to the `main_nn` by putting `K.stop_gradient` in between the two networks. However, due to the `Concatenate` layer, the `speed` loss will still be propagated over the `dense_nn_2` from `height`. Is it possible to fix this? – 0vbb Jul 13 '18 at 10:11
  • I think using `y_pred_speed = K.stop_gradient(y_pred[:, 1])` in `custom_loss` can fix it (see the sketch after these comments). – Yu-Yang Jul 16 '18 at 01:53
  • @Yu-Yang Can you please take a look at this question? https://stackoverflow.com/q/58900947/5904928 I have been struggling for hours and couldn't find an answer. – Aaditya Ura Nov 17 '19 at 13:46
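
For reference, here is a minimal sketch of the stop-gradient variant suggested in the comments above, i.e. custom_loss rewritten so that no gradient flows back through the speed column of the concatenated output (this is my reading of the suggestion, not code from the original answer):

from keras import backend as K
from keras import losses

def custom_loss(y_true, y_pred):
    y_pred_height = y_pred[:, 0]
    # stop_gradient blocks backpropagation through the speed branch for this loss
    y_pred_speed = K.stop_gradient(y_pred[:, 1])

    # some loss which depends on the value of `speed`
    return losses.mean_absolute_error(y_true, y_pred_height * y_pred_speed)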