I am currently trying to get into neural network building via keras/tensorflow and am working through some example problems. At the moment I am trying to understand how to properly save and load a model via model.save()/load_model(). I would expect that, if everything is set up properly, loading a pre-trained model and continuing the training should not spoil my prior accuracies but simply continue exactly where I left off.
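Roughly, the workflow I have in mind looks like the following minimal sketch (the toy architecture, the dummy data and the file name my_model.h5 are placeholders, not my actual setup):

```python
import numpy as np
from tensorflow import keras

# dummy data, only to make the sketch runnable
x_train = np.random.rand(256, 10)
y_train = np.random.rand(256, 1)

# initial training run
model = keras.Sequential([
    keras.layers.Dense(64, activation='relu', input_shape=(10,)),
    keras.layers.Dense(1)
])
model.compile(optimizer=keras.optimizers.Adam(lr=0.001, decay=1e-4), loss='mse')
model.fit(x_train, y_train, epochs=10)
model.save('my_model.h5')  # should store architecture, weights and optimizer state

# ... later, in a fresh session ...
model = keras.models.load_model('my_model.h5')
model.fit(x_train, y_train, epochs=20, initial_epoch=10)  # continue where I left off
```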
However, it doesn't. My accuracies fluctuate heavily after I load the model and take a while to actually return to their prior values:
First Run
Continued Run
After digging through various threads with possible explanations (none of which were applicable to my findings), I think I have figured out the reason:
I use tf.keras.optimizers.Adam for my weight optimization, and after checking its initializer and its get_config method
def __init__(self, [...], **kwargs):
    super(Adam, self).__init__(**kwargs)
    with K.name_scope(self.__class__.__name__):
        self.iterations = K.variable(0, dtype='int64', name='iterations')
        [...]

def get_config(self):
    config = {
        'lr': float(K.get_value(self.lr)),
        'beta_1': float(K.get_value(self.beta_1)),
        'beta_2': float(K.get_value(self.beta_2)),
        'decay': float(K.get_value(self.decay)),
        'epsilon': self.epsilon,
        'amsgrad': self.amsgrad
    }
it seems as if the "iterations" counter is always reset to 0, and its current value is neither stored nor loaded when the entire model is saved, as it is not part of the config dict. This seems to contradict the statement that model.save saves "the state of the optimizer, allowing to resume training exactly where you left off" (https://keras.io/getting-started/faq/). Since the iterations counter is what drives the decay of the learning rate in the Adam implementation,
lr = lr * (1. / (1. + self.decay * math_ops.cast(self.iterations,
                                                 K.dtype(self.decay))))
my model will always restart with the initial "large" learning rate, even if I set the "initial_epoch" parameter in model.fit() to the actual epoch number where my model was saved (see images uploaded above).
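To make the effect concrete, here is a quick back-of-the-envelope illustration of that formula (the learning rate and decay values are example numbers, not my actual settings):

```python
# inverse-time decay as in the Adam snippet above (illustration only)
lr0, decay = 0.001, 1e-4

def effective_lr(iterations):
    return lr0 * (1. / (1. + decay * iterations))

print(effective_lr(0))      # 0.001      -> what a freshly (re)initialized optimizer uses
print(effective_lr(50000))  # ~0.000167  -> what it should be after 50000 update steps
```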
So my questions are:
- Is this intended behavior?
- If so, how is this in agreement with the cited statement from the keras FAQ that model.save() "resumes training exactly where you left off"?
- Is there a way to actually save and restore the Adam optimizer, including the iterations counter, without writing my own optimizer? (I already discovered that writing one is a possible solution, but I was wondering if there really is no simpler method; see the sketch after this list.)
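The only workaround I could come up with so far is transferring the counter by hand via the backend. This is only a hypothetical sketch (it assumes model.optimizer is the Adam instance used during training), and I don't know whether it is the intended way:

```python
from tensorflow.keras import backend as K

# read the update-step counter from the trained model's optimizer
iters = K.get_value(model.optimizer.iterations)

# ... save the model, later load it again (and possibly re-compile) ...

# write the counter back into the (re)loaded model's optimizer
K.set_value(model.optimizer.iterations, iters)
```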
Edit: I found the reason/solution: I called model.compile after load_model, and this resets the optimizer while keeping the weights (see also Does model.compile() initialize all the weights and biases in Keras (tensorflow backend)?).
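So the fix was simply to drop the second compile call. A minimal sketch of the resume step that works for me (using the same placeholder data and file name as in the first sketch above):

```python
from tensorflow import keras

# load_model restores the compiled state, including the optimizer and its counters
model = keras.models.load_model('my_model.h5')

# DO NOT re-compile here - that creates a fresh optimizer with iterations = 0
# model.compile(optimizer=keras.optimizers.Adam(lr=0.001, decay=1e-4), loss='mse')

# just continue training with the restored optimizer state
model.fit(x_train, y_train, epochs=20, initial_epoch=10)
```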