I trained a general model on 100,000 samples in Keras and achieved good performance. Then, for one particular sample, I want to use the trained weights as initialization and continue optimizing them to further reduce the loss on that sample.

However, a problem occurred. First, I load the trained weights easily through the Keras API; then I evaluate the loss on the one particular sample, and it is close to the validation loss seen during training, which seems normal. However, when I use the trained weights as initialization and further optimize them on that one sample with model.fit(), the reported loss is really strange: it is much higher than the evaluate() result and only gradually becomes normal after several epochs.
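
Roughly, my workflow looks like this (build_model, "pretrained.h5", x_sample, and y_sample are placeholders):

```python
import numpy as np

# Rebuild the architecture that was trained on the 100,000 samples
# (build_model and the weight file name are placeholders)
model = build_model()
model.load_weights("pretrained.h5")
model.compile(optimizer="adam", loss="mse")

# The one particular sample, with a leading batch dimension of 1
x_one = np.expand_dims(x_sample, axis=0)
y_one = np.expand_dims(y_sample, axis=0)

print(model.evaluate(x_one, y_one))  # close to the validation loss -- looks normal
model.fit(x_one, y_one, epochs=10)   # the reported loss is much higher at first
```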

I find it strange that, for the same single sample and the same loaded weights, model.fit() and model.evaluate() return different results. I use batch normalization layers in my model and wonder whether they might be the reason. The result of model.evaluate() seems normal, as it is close to what I saw on the validation set before.

So what causes the difference between fit and evaluate? How can I solve it?

1 Answer

I think your core issue is that you are comparing two different loss values reported by fit and evaluate. This has been discussed extensively in several related questions.

The loss reported by fit() includes contributions from:

  1. Regularizers: any L1/L2 regularization loss is added to the training loss, increasing its value.
  2. Batch norm variations: during fit(), batch normalization normalizes with the current batch's mean and variance (while updating the running statistics used at inference), irrespective of whether the batch norm layers are set to trainable or not. See here for more discussion on that, and the sketch after this list.
  3. Multiple batches: the training loss is averaged over the batches within an epoch. So if you average over the first 100 batches but evaluate only on the 100th batch, the results will differ.
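
To see point 2 in isolation, you can call a model containing batch norm directly with training=True versus training=False on the same batch: training mode normalizes with the current batch statistics, while inference mode uses the accumulated moving averages, so the outputs (and hence the loss) differ. A minimal sketch with a toy model:

```python
import numpy as np
import tensorflow as tf

# Toy model: a dense layer followed by batch normalization
model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, input_shape=(8,)),
    tf.keras.layers.BatchNormalization(),
])

x = np.random.randn(16, 8).astype("float32")

# Training mode: normalizes with this batch's mean and variance
y_train_mode = model(x, training=True)

# Inference mode: normalizes with the stored moving averages
y_infer_mode = model(x, training=False)

# The two outputs differ, which is one source of the fit/evaluate gap
print(np.abs(y_train_mode.numpy() - y_infer_mode.numpy()).max())
```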

evaluate(), by contrast, just runs a forward pass in inference mode and returns the loss; nothing stochastic is involved.

The bottom line is that you should not compare the training and validation losses (or the fit and evaluate losses) directly; those functions do different things. Look at other metrics to check that your model is training properly.
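
If you do want a number that is comparable across epochs, one option is to re-run evaluate() on the same data after each epoch, for example with a small callback; a sketch, reusing the x_one/y_one arrays from the question:

```python
import tensorflow as tf

class EvalAfterEpoch(tf.keras.callbacks.Callback):
    """Report the inference-mode loss on a fixed sample after each epoch."""
    def __init__(self, x, y):
        super().__init__()
        self.x, self.y = x, y

    def on_epoch_end(self, epoch, logs=None):
        loss = self.model.evaluate(self.x, self.y, verbose=0)
        print(f"epoch {epoch}: evaluate() loss = {loss}")

model.fit(x_one, y_one, epochs=10, callbacks=[EvalAfterEpoch(x_one, y_one)])
```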

  • Thank you very much for your kind help! I think the BN layer is the reason, but I still don't know how to solve it. I have explained my problem in more detail in a new question; could you kindly have a look at it? Thank you very much: https://stackoverflow.com/questions/53911702/fine-tune-with-batch-normalization-in-keras – LinTIna Dec 24 '18 at 09:46