
I'm fine-tuning a Keras model that outputs three different predictions for three subtasks. The model output is a list:

out = [[batch_size,5],[batch_size,6],[batch_size,6]]

I only want to compute the categorical cross-entropy loss for the 3rd output, so I defined a simple custom loss function:

def my_loss_fn(y_true, y_pred):
    out = y_pred[-1]
    return tf.keras.losses.CategoricalCrossentropy()(y_true, out)

However, TensorFlow complains: ValueError: Shapes (96, 6) and (5,) are incompatible.

It seems as though y_pred[-1] only returns elements from the final index of the model's first output.

How do I ignore the first two model outputs and only consider the last output when computing the loss?
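
(A minimal sketch, assuming TF 2.x, of the shape behaviour behind the error: when a multi-output model is compiled with a single loss function, Keras applies that function to each output separately, so inside my_loss_fn the y_pred argument is already a single output tensor, and y_pred[-1] slices off its last row rather than selecting the last output.)

import tensorflow as tf

y_pred = tf.zeros((96, 5))   # stand-in for the model's first output for one batch
print(y_pred[-1].shape)      # (5,) -- hence "Shapes (96, 6) and (5,) are incompatible"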


1 Answer


We can define a loss function for each output of a multi-output model. To do that, use the names of the model's last (output) layers as keys in the loss dict passed to compile. One way to achieve this is shown below.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, utils
import numpy as np

# Build three targets from the MNIST labels: 10-way one-hot,
# binary (is the digit even?), and a squared-label regression target.
(xtrain, ytrain), (_, _) = keras.datasets.mnist.load_data()
y_out_a = utils.to_categorical(ytrain, num_classes=10)
y_out_b = (ytrain % 2 == 0).astype('float32')
y_out_c = tf.square(tf.cast(ytrain, tf.float32))

batch_size = 32
data_image = tf.data.Dataset.from_tensor_slices(
    xtrain[..., None]  # add a channel axis: (28, 28) -> (28, 28, 1)
)
data_label = tf.data.Dataset.from_tensor_slices(
    (y_out_a, y_out_b, y_out_c)
)
dataset = tf.data.Dataset.zip((data_image, data_label))
dataset = dataset.shuffle(buffer_size=8 * batch_size)
dataset = dataset.batch(batch_size).prefetch(tf.data.AUTOTUNE)

# Sanity-check one batch of labels:
x, y = next(iter(dataset))
y[0].shape, y[1].shape, y[2].shape
# (TensorShape([32, 10]), TensorShape([32]), TensorShape([32]))
# A simple multi-output model; note the names given to the output layers.
input = keras.Input(shape=(28, 28, 1))
x = layers.Flatten()(input)
x = layers.Dense(128, activation='relu')(x)
out_a = layers.Dense(10, activation='softmax', name='10cls')(x)  # multi-class
out_b = layers.Dense(1, activation='sigmoid', name='2cls')(x)    # binary
out_c = layers.Dense(1, activation='linear', name='1rg')(x)      # regression
func_model = keras.Model(
    inputs=[input], outputs=[out_a, out_b, out_c]
)
def categorical(y_true, y_pred):
    return keras.losses.CategoricalCrossentropy()(y_true, y_pred) 

def binary(y_true, y_pred):
    return keras.losses.BinaryCrossentropy()(y_true, y_pred) 

def mse(y_true, y_pred):
    return keras.losses.MeanSquaredError()(y_true, y_pred) 

# Compile with a loss only for the target output; outputs omitted from
# the dict do not contribute to training.
func_model.compile(
    loss = {
        "10cls": categorical,
        # "2cls": binary,
        # "1rg": mse,
    },
    optimizer = keras.optimizers.Adam()
)

func_model.fit(
    dataset.take(100),
)
# 4ms/step - loss: 17.5582 - 10cls_loss: 17.5582
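
As an aside (an assumption about tf.keras, not something shown in the original answer): compile also accepts a positional list of losses in output order, where a None entry skips that output. For the questioner's model, the categorical loss would go in the last slot.

# Positional alternative to the dict form above; a None entry (assumed
# tf.keras behaviour) means "no loss for this output".
func_model.compile(
    loss=[categorical, None, None],   # questioner's case: [None, None, categorical]
    optimizer=keras.optimizers.Adam()
)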

Some further resources may also help.

– Innat
  • hi, thanks for the help. This is exactly what I've been doing, yet the issue persists. To provide some context, I have a pre-trained model that already produces two outputs. I named these outputs when defining the architecture. In the "fine-tuning" stage, I froze the entire pretrained model, but unlike the usual fine-tuning case, I want to retain the pretrained model's outputs as well as fine-tune a separate branch that produces a third output. So in a way, it's like teaching the model a new trick while retaining its past outputs. – HuckleberryFinn Mar 02 '23 at 14:28
  • Could you please use my example above, modify it according to your set-up (the modeling you described), and share it via Colab? That would help in understanding your main issue. – Innat Mar 02 '23 at 17:12
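
A minimal sketch of the set-up described in the comment above, with hypothetical names and shapes (old_a, old_b, new_c and the layer sizes are all illustrative, not from the post): freeze the pretrained two-output model, graft a trainable third head onto its features, and compile with a loss on the new output only.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical stand-in for the pretrained two-output model.
inp = keras.Input(shape=(28, 28, 1))
feat = layers.Dense(128, activation='relu')(layers.Flatten()(inp))
out_a = layers.Dense(5, activation='softmax', name='old_a')(feat)
out_b = layers.Dense(6, activation='softmax', name='old_b')(feat)
pretrained = keras.Model(inp, [out_a, out_b])
pretrained.trainable = False  # freezes every layer of the pretrained graph

# New trainable branch grafted onto the frozen features.
out_c = layers.Dense(6, activation='softmax', name='new_c')(feat)
fine_model = keras.Model(inp, [out_a, out_b, out_c])

# Loss only on the new head: the old outputs are still produced at
# inference time but contribute nothing to training.
fine_model.compile(
    loss={'new_c': keras.losses.CategoricalCrossentropy()},
    optimizer=keras.optimizers.Adam()
)

Since the frozen layers are shared between pretrained and fine_model, setting pretrained.trainable = False is enough to keep them fixed while the new head trains.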