I encountered a strange thing in a testing-on-training experiment: the val_loss is completely different from the training loss, even though both are evaluated on the exact same data (X, Y) with the same batch_size. Below is the code that I used to train on one batch:
X, Y = valid_datagen.next()
batch_size = len(X[0])  # X is a list of input arrays, so this is the number of samples
joint_model.fit(X, Y,
                batch_size=batch_size,
                epochs=1,
                verbose=1,
                validation_data=(X, Y))
Train on 12 samples, validate on 12 samples
Epoch 1/1
12/12 [==============================] - 38s 3s/step
 - loss: 0.7510 - q_mask_a_loss: 0.4739 - r_mask_a_loss: 0.6610 - q_mask_b_loss: 0.4718 - r_mask_b_loss: 0.3164 - pred_a_loss: 1.8092 - pred_b_loss: 0.2238
 - q_mask_a_F1: 0.8179 - r_mask_a_F1: 0.5318 - q_mask_b_F1: 0.8389 - r_mask_b_F1: 0.6134 - pred_a_acc: 0.0833 - pred_b_acc: 1.0000
 - val_loss: 7.0257 - val_q_mask_a_loss: 6.9748 - val_r_mask_a_loss: 14.9849 - val_q_mask_b_loss: 6.9748 - val_r_mask_b_loss: 14.9234 - val_pred_a_loss: 0.6919 - val_pred_b_loss: 0.6944
 - val_q_mask_a_F1: 0.0000e+00 - val_r_mask_a_F1: 0.0000e+00 - val_q_mask_b_F1: 0.0000e+00 - val_r_mask_b_F1: 0.0000e+00 - val_pred_a_acc: 1.0000 - val_pred_b_acc: 0.0000e+00
Note:
- The training loss is 0.7510 while the val_loss is 7.0257.
- I've already set the batch_size equal to the number of samples, i.e. I am training on exactly one batch.
- I am using keras 2.2.0 with the tensorflow backend 1.5.0.
- Running joint_model.evaluate(X, Y, batch_size=batch_size) gives the same result as the validation, as shown in the snippet below.
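For completeness, that check looks like the following sketch (using joint_model.metrics_names just to label the numbers; X, Y and batch_size are the same objects as in the fit() call above):

    # Evaluate on the identical batch that was just used for training.
    eval_results = joint_model.evaluate(X, Y, batch_size=batch_size, verbose=0)

    # Pair each value with its metric name so the output is readable.
    for name, value in zip(joint_model.metrics_names, eval_results):
        print(name, value)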
Regarding the joint_model: it is nothing but a feed-forward CNN with frozen weights in the first several layers, and there is no Dropout layer anywhere.
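The freezing is done in the standard way, roughly like the sketch below (the cut-off index, optimizer, and loss are placeholders, not the actual model configuration):

    # Hypothetical sketch of the freezing step; N_FROZEN and the compile
    # arguments are placeholders, not the real values.
    N_FROZEN = 5
    for layer in joint_model.layers[:N_FROZEN]:
        layer.trainable = False

    # The trainable flags only take effect after (re-)compiling.
    joint_model.compile(optimizer='adam',
                        loss='binary_crossentropy',
                        metrics=['accuracy'])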
I have absolutely no idea what is going on here. Does anyone know what the potential reasons are, or how to debug this? Any suggestions are welcome.
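One check I can think of, in case this is a train/test-phase issue, is to run the forward pass with the learning phase set explicitly and compare the outputs. A minimal sketch of that idea (assuming X is a list of input arrays, matching the batch_size line above):

    import keras.backend as K

    # Forward pass with an explicit learning phase:
    # 1 = train phase (as in fit), 0 = test phase (as in evaluate/validation).
    forward = K.function(joint_model.inputs + [K.learning_phase()],
                         joint_model.outputs)

    outputs_train_phase = forward(X + [1])
    outputs_test_phase = forward(X + [0])

    # If these differ wildly, some layer behaves differently between the two phases.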