
I am trying to understand how loss functions work with the Keras functional API. I have a sample multi-output model based on the B-CNN architecture.

from tensorflow import keras
from tensorflow.keras import optimizers
from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization,
                                     MaxPooling2D, Flatten, Dense, Dropout)

img_input = Input(shape=input_shape, name='input')

#--- block 1 ---
x = Conv2D(32, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
x = BatchNormalization()(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

#--- coarse 1 branch ---
c_1_bch = Flatten(name='c_flatten')(x)
c_1_bch = Dense(64, activation='relu', name='c_dense')(c_1_bch)
c_1_bch = BatchNormalization()(c_1_bch)
c_1_bch = Dropout(0.5)(c_1_bch)
c_1_pred = Dense(num_c, activation='softmax', name='pred_coarse')(c_1_bch)

#--- block 3 ---
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
x = BatchNormalization()(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
x = BatchNormalization()(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

#--- fine block ---
x = Flatten(name='flatten')(x)
x = Dense(128, activation='relu', name='fc_1')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
fine_pred = Dense(num_classes, activation='softmax', name='pred_fine')(x)


model = keras.Model(inputs=[img_input],
                    outputs=[c_1_pred, fine_pred],
                    name='B-CNN_Model')

This classification model takes one input and produces two predictions. According to this post, we need to compile it with the proper loss functions, metrics, and optimizer, keying each by the `name` of the corresponding output layer.

I have done this in the following way.

model.compile(optimizer = optimizers.SGD(learning_rate=0.003, momentum=0.9, nesterov=True),
              loss={'pred_coarse':'mse',
                    'pred_fine':'categorical_crossentropy'}, 
              loss_weights={'pred_coarse':beta,
                            'pred_fine':gamma},
              metrics={'pred_coarse':'accuracy',
                       'pred_fine':'accuracy'})

[Note: Here, the output layer pred_coarse uses mean squared error and pred_fine uses categorical cross-entropy. The loss weights beta and gamma are variables whose values are updated after certain epochs using a keras.callbacks.Callback.]
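A minimal sketch of such a loss-weight schedule (the variable names match the post, but the class name, epoch thresholds, and values are illustrative assumptions, not from the original code): creating beta and gamma as non-trainable `tf.Variable`s lets a callback change them between epochs.

```python
import tensorflow as tf
from tensorflow import keras

# Hypothetical sketch: beta and gamma as non-trainable variables so a
# callback can reassign their values between epochs. The thresholds and
# values below are illustrative only.
beta = tf.Variable(0.9, trainable=False, dtype=tf.float32)
gamma = tf.Variable(0.1, trainable=False, dtype=tf.float32)

class LossWeightsScheduler(keras.callbacks.Callback):
    """Shift loss weight from the coarse head to the fine head over training."""
    def on_epoch_end(self, epoch, logs=None):
        if epoch == 10:
            beta.assign(0.5)
            gamma.assign(0.5)
        elif epoch == 20:
            beta.assign(0.1)
            gamma.assign(0.9)
```

Whether `compile()` re-reads the updated values at each step depends on the Keras version; the original B-CNN reference code achieves the same effect with `K.variable` and `K.set_value` inside a callback.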

Now, my question is: what happens if we compile the model without referencing the output layer names and provide only one loss function instead? For example, suppose we compile the model as follows:

model.compile(optimizer=optimizers.SGD(learning_rate=0.003, momentum=0.9, nesterov=True), 
              loss='categorical_crossentropy', 
              loss_weights=[beta, gamma],
              metrics=['accuracy'])

Unlike the previous compile example, this one specifies a single categorical cross-entropy loss. The model compiles and runs without any errors. Does the model use the categorical cross-entropy loss for both the pred_coarse and pred_fine output layers?

tasrif
    Yes, your model would apply categorical cross entropy for all outputs. Your output should be like your input's labels, so the labels of `img_input` should be two elements per feature. – Djinn Oct 20 '22 at 04:35
  • Thank you @Djinn for clarifying the concept. Btw, what do you mean by "labels of `img_input` should be two elements per feature"? Does it mean the number of prediction classes should be two? In that case, it can be more than two classes. – tasrif Oct 20 '22 at 04:59
  • If you have features that map to two labels in the output, you'll need similarly structured labels in the input to compare against when calculating loss. If feature `x` corresponds with output `[[1], [2]]`, the only way to calculate loss is to compare with the true label, for example `[[3], [4]]`. It has nothing to do with the number of classes. – Djinn Oct 20 '22 at 07:13
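As the comments confirm, a single loss string is broadcast to every output. A minimal standalone sketch (tiny hypothetical layer sizes, not the B-CNN model) showing that this compiles and trains with the same loss applied to both heads:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Tiny two-output model (shapes are illustrative only).
inp = keras.Input(shape=(4,))
pred_coarse = layers.Dense(2, activation='softmax', name='pred_coarse')(inp)
pred_fine = layers.Dense(5, activation='softmax', name='pred_fine')(inp)
model = keras.Model(inp, [pred_coarse, pred_fine])

# One loss string: Keras applies categorical cross-entropy to BOTH outputs,
# as if a dict had mapped each output name to that same loss.
model.compile(optimizer='sgd',
              loss='categorical_crossentropy',
              loss_weights=[0.3, 0.7])

x = np.random.rand(8, 4).astype('float32')
y_coarse = np.eye(2)[np.random.randint(0, 2, 8)].astype('float32')
y_fine = np.eye(5)[np.random.randint(0, 5, 8)].astype('float32')
logs = model.train_on_batch(x, [y_coarse, y_fine], return_dict=True)
# logs contains a combined 'loss' plus one loss term per output.
```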

0 Answers