I am trying to understand the loss function using Keras functional API. I have a sample multi-output model based on the B-CNN model.
img_input = Input(shape=input_shape, name='input')
#--- block 1 ---
x = Conv2D(32, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
x = BatchNormalization()(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
#--- coarse 1 branch ---
c_1_bch = Flatten(name='c_flatten')(x)
c_1_bch = Dense(64, activation='relu', name='c_dense')(c_1_bch)
c_1_bch = BatchNormalization()(c_1_bch)
c_1_bch = Dropout(0.5)(c_1_bch)
c_1_pred = Dense(num_c, activation='softmax', name='pred_coarse')(c_1_bch)
#--- block 3 ---
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
x = BatchNormalization()(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
x = BatchNormalization()(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)
#--- fine block ---
x = Flatten(name='flatten')(x)
x = Dense(128, activation='relu', name='fc_1')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
fine_pred = Dense(num_classes, activation='softmax', name='pred_fine')(x)
model = keras.Model(inputs= [img_input],
outputs= [c_1_pred, fine_pred],
name='B-CNN_Model')
This classification model takes one input and provides 2 predictions.
According to this post, we need to compile it first with the proper loss function, metrics, and optimizer by mentioning the name
variables for each output layer.
I have done this in the following way.
model.compile(optimizer = optimizers.SGD(learning_rate=0.003, momentum=0.9, nesterov=True),
loss={'pred_coarse':'mse',
'pred_fine':'categorical_crossentropy'},
loss_weights={'pred_coarse':beta,
'pred_fine':gamma},
metrics={'pred_coarse':'accuracy',
'pred_fine':'accuracy'})
[Note: Here, output layer pred_coarse
is using Mean Square Error and pred_fine
is using Categorical Cross Entropy loss function. The loss_weights beta
and gamma
are variable and update the value after certain epochs using keras.callbacks.Callback
function ]
Now, My question is, what happens if we compile the model without mentioning the name
variables for each output layer and provide only one function instead? For example, we compile the model as follows:
model.compile(optimizer=optimizers.SGD(learning_rate=0.003, momentum=0.9, nesterov=True),
loss='categorical_crossentropy',
loss_weights=[beta, gamma],
metrics=['accuracy'])
Unlike the previous compile example, this one uses the Categorical Cross Entropy loss function. The model compiles and runs without any errors. Does the model using Categorical Cross Entropy loss function for both pred_coarse
and pred_fine
output layers?