
I am using a multiple-output model in Keras:

model1 = Model(inputs=x, outputs=[y2, y3])

model1.compile(optimizer='sgd', loss=custom_loss_function)

My custom loss function is:

def custom_loss_function(y_true, y_pred):
    y2_true = y_true[0]
    y2_pred = y_pred[0]

    loss = K.mean(K.square(y2_true - y2_pred), axis=-1)
    return loss

I only want to train the network on output y2.

What is the shape/structure of the y_true and y_pred arguments in the loss function when multiple outputs are used? Can I access them as above? Is it y_pred[0] or y_pred[:,0]?

– shaaa

3 Answers


I only want to train the network on output y2.

Based on the Keras functional API guide, you can achieve that with loss_weights:

model1 = Model(inputs=x, outputs=[y2, y3])
model1.compile(optimizer='sgd', loss=custom_loss_function,
               loss_weights=[1., 0.])

What is the shape/structure of the y_true and y_pred arguments in the loss function when multiple outputs are used? Can I access them as above? Is it y_pred[0] or y_pred[:,0]?

In Keras multi-output models, the loss function is applied to each output separately. In pseudo-code:

loss = sum(loss_function(output_true, output_pred)
           for (output_true, output_pred) in zip(outputs_data, outputs_model))

As far as I can tell, there is no built-in way to apply a single loss function across multiple outputs jointly. One could probably achieve that by incorporating the loss computation as a layer of the network.
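
To make the per-output behavior concrete, here is a minimal runnable sketch with hypothetical layer names and shapes. Passing a dict of losses and loss_weights keyed by output name is equivalent to the list form above:

import numpy as np
from keras.layers import Input, Dense
from keras.models import Model
from keras import backend as K

def custom_loss_function(y_true, y_pred):
    # Called once per output: y_true and y_pred are the batch
    # tensors for that single output, shaped (batch_size, 1) here
    return K.mean(K.square(y_true - y_pred), axis=-1)

x = Input(shape=(10,))
y2 = Dense(1, name='y2')(x)
y3 = Dense(1, name='y3')(x)
model1 = Model(inputs=x, outputs=[y2, y3])

# A weight of 0.0 removes y3's contribution to the total loss
model1.compile(optimizer='sgd',
               loss={'y2': custom_loss_function, 'y3': custom_loss_function},
               loss_weights={'y2': 1.0, 'y3': 0.0})

model1.fit(np.random.rand(32, 10),
           {'y2': np.random.rand(32, 1), 'y3': np.random.rand(32, 1)},
           epochs=1)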

– Sharapolas
  • `In keras multi-output models loss function is applied for each output separately.` I have a similar problem: I separately need the y_true and y_pred values of two separate outputs. How can I solve this? – Eka Jan 21 '18 at 05:20
  • Unless the framework changed recently, the easiest solution is to concatenate the outputs into a single loss function and then handle them there – Sharapolas Jan 24 '18 at 17:31
  • @Sharapolas Do you have a practical example of this statement: `the easiest solution is to concatenate the outputs into a single loss function and then to handle them there`? – ihavenoidea Nov 18 '19 at 22:37
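
Since a practical example was requested in the comments, here is a minimal sketch of the concatenation approach, assuming both outputs have compatible shapes (one unit each here) so they can be merged along the last axis. All names are illustrative:

import numpy as np
from keras.layers import Input, Dense, Concatenate
from keras.models import Model
from keras import backend as K

x = Input(shape=(10,))
y2 = Dense(1)(x)
y3 = Dense(1)(x)
# Merge the outputs into one tensor so a single loss sees both at once
merged = Concatenate(axis=-1)([y2, y3])
model = Model(inputs=x, outputs=merged)

def joint_loss(y_true, y_pred):
    # Slice the concatenated tensor back apart; note the [:, i]
    # indexing, since axis 0 is the batch dimension
    y2_true, y2_pred = y_true[:, 0], y_pred[:, 0]
    y3_true, y3_pred = y_true[:, 1], y_pred[:, 1]
    # Any interdependent combination of the outputs is possible here;
    # this one simply trains on y2 and ignores y3
    return K.mean(K.square(y2_true - y2_pred))

model.compile(optimizer='sgd', loss=joint_loss)

# Targets must be concatenated the same way as the outputs
y_train = np.concatenate([np.random.rand(32, 1), np.random.rand(32, 1)], axis=-1)
model.fit(np.random.rand(32, 10), y_train, epochs=1)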

The accepted answer won't work in general if the custom loss can't be applied to the outputs you're trying to ignore, e.g. if they have the wrong shapes. In that case you can assign a dummy loss function to those outputs:

labels = [labels_for_relevant_output, dummy_labels_for_ignored_output]

def dummy_loss(y_true, y_pred):
    return 0.0

model.compile(optimizer='sgd', loss=[custom_loss_function, dummy_loss])
model.fit(x, labels)
– Elan
  • Note that one might also have to change the metrics so that they specify which output they belong to. This is done by passing a dictionary of metrics, where the key is the name of the layer/output to map to. – Jon Nordby Oct 17 '21 at 13:57
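
A minimal sketch of that metrics mapping, with illustrative output names; metrics given as a dict keyed by output name are only computed for that output:

import numpy as np
from keras.layers import Input, Dense
from keras.models import Model

x = Input(shape=(10,))
relevant = Dense(1, name='relevant')(x)   # the output we train on
ignored = Dense(4, name='ignored')(x)     # a differently shaped output

model = Model(inputs=x, outputs=[relevant, ignored])

def dummy_loss(y_true, y_pred):
    return 0.0

model.compile(optimizer='sgd',
              loss={'relevant': 'mse', 'ignored': dummy_loss},
              # Key the metrics by output name so 'mae' is only
              # reported for the output it makes sense on
              metrics={'relevant': ['mae']})

model.fit(np.random.rand(32, 10),
          {'relevant': np.random.rand(32, 1),
           'ignored': np.zeros((32, 4))},   # dummy labels
          epochs=1)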

Sharapolas' answer is right.

However, there is a better way than using a layer to build custom loss functions with complex interdependencies between several outputs of a model.

The method I know is used in practice is to never call model.compile, but only model._make_predict_function(). From there on, you can build a custom training routine using model.output, which gives you all outputs, [y2, y3] in your case. When you have done your magic with them, get an optimizer from keras.optimizers and call its get_updates method with your model.trainable_weights and your loss. Finally, build a keras.backend.function from the list of required inputs (in your case only model.input) and the updates you just got from the optimizer.get_updates call. This function now replaces model.fit.
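
A minimal TF1-era sketch of this pattern; the extra loss term coupling y3 to y2 is made up purely to show that both symbolic outputs are available at once (and, as noted below, this approach does not carry over to tf 2.0):

import numpy as np
from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import Adam
from keras import backend as K

x = Input(shape=(10,))
y2 = Dense(1)(x)
y3 = Dense(1)(x)
# model.compile is never called for this model
model = Model(inputs=x, outputs=[y2, y3])

# Placeholder for the y2 targets; y3 needs none in this example
y2_target = K.placeholder(shape=(None, 1))

# A loss with simultaneous access to both symbolic outputs
loss = K.mean(K.square(model.outputs[0] - y2_target)) \
       + 0.01 * K.mean(K.abs(model.outputs[1] - model.outputs[0]))

optimizer = Adam(lr=1e-3)
updates = optimizer.get_updates(loss=loss, params=model.trainable_weights)

# Each call performs one gradient step; this replaces model.fit
train_fn = K.function(inputs=[model.input, y2_target],
                      outputs=[loss],
                      updates=updates)

loss_value = train_fn([np.random.rand(32, 10), np.random.rand(32, 1)])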

The above is often used in policy gradient algorithms such as A3C or PPO. Here is an example of what I tried to explain: https://github.com/Hyeokreal/Actor-Critic-Continuous-Keras/blob/master/a2c_continuous.py Look at the build_model and critic_optimizer methods and read the keras.backend.function documentation to understand what happens.

I found that this approach frequently has problems with session management, and it does not appear to work in tf 2.0 Keras at all currently. Hence, if anyone knows a method, please let me know. I came here looking for one :)

– Nric