
I would like to make a deep copy of a Keras model (called model1) so that I can use it in a for loop, re-initializing it at each iteration and fitting it with one additional sample. I want to reset the model after each iteration because fit modifies it, and I want to keep it in the state it had when I loaded the weights from the path with load_weights.

My code looks like:

model1 = create_Model()
model1.compile(optimizer='rmsprop', loss='categorical_crossentropy')
model1.load_weights('my_weights')

model_copy = create_Model()
model_copy.compile(optimizer='rmsprop', loss='categorical_crossentropy')

model_copy = keras.models.clone_model(model1)
for j in range(0, image_size):
    model_copy.fit(sample[j], sample_lbl[j])
    prediction = model_copy.predict(sample[j])

Also, it is not really efficient for me to load the model each time inside the for loop, since that is time-consuming. How can I do the deep copy properly in my case? The code I posted gives the following error, which concerns the fit call on my cloned model model_copy:

RuntimeError: You must compile a model before training/testing. Use model.compile(optimizer, loss).

Jose Ramon

3 Answers


The issue is that model_copy is probably not compiled after cloning. There are in fact a few issues:

  1. Apparently cloning doesn't copy over the loss function, optimizer info, etc.

  2. Before compiling you need to also build the model.

  3. Moreover, cloning doesn't copy the weights over.

So you need a couple of extra lines after cloning. For example, for a model with 10 input variables:

model_copy = keras.models.clone_model(model1)
model_copy.build((None, 10))  # replace 10 with the number of variables in your input layer
model_copy.compile(optimizer='rmsprop', loss='categorical_crossentropy')
model_copy.set_weights(model1.get_weights())
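
As a quick sanity check that the clone is then independent of the original, here is a minimal, self-contained sketch; the small two-layer Dense model and the random data are assumptions standing in for the real create_Model() and samples:

```python
import numpy as np
from tensorflow import keras

# Small stand-in for create_Model(), just for illustration.
model1 = keras.Sequential([
    keras.Input(shape=(10,)),
    keras.layers.Dense(3, activation='relu'),
    keras.layers.Dense(2, activation='softmax'),
])
model1.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# Clone, build, compile, then copy the weights, as described above.
model_copy = keras.models.clone_model(model1)
model_copy.build((None, 10))
model_copy.compile(optimizer='rmsprop', loss='categorical_crossentropy')
model_copy.set_weights(model1.get_weights())

# Training the copy must leave the original's weights untouched.
x = np.random.rand(8, 10).astype('float32')
y = keras.utils.to_categorical(np.random.randint(0, 2, size=8), 2)
before = [w.copy() for w in model1.get_weights()]
model_copy.fit(x, y, epochs=1, verbose=0)
assert all(np.array_equal(a, b) for a, b in zip(before, model1.get_weights()))
```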


Easier Method 1: Loading weights from file

If I understand your question correctly, there is an easier way to do this. You don't need to clone the model; you just need to save the old weights and reset them at the beginning of each outer iteration. You can simply load the weights from file, as you are already doing.

for _ in range(10):
    model1 = create_Model()
    model1.compile(optimizer='rmsprop', loss='categorical_crossentropy')
    model1.load_weights('my_weights')

    for j in range(0, image_size):
        model1.fit(sample[j], sample_lbl[j])
        prediction = model1.predict(sample[j])

Easier Method 2: Loading weights from previous get_weights()

Or, if you prefer not to load from a file:

model1 = create_Model()
model1.compile(optimizer='rmsprop', loss='categorical_crossentropy')
model1.load_weights('my_weights')
old_weights = model1.get_weights()

for _ in range(10):
    model1.set_weights(old_weights)
    for j in range(0, image_size):
        model1.fit(sample[j], sample_lbl[j])
        prediction = model1.predict(sample[j])
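
A tiny runnable sketch of this snapshot-and-restore pattern; the small Dense model and random data are stand-ins for create_Model() and the real samples:

```python
import numpy as np
from tensorflow import keras

# Stand-in for create_Model(); the real architecture doesn't matter here.
model1 = keras.Sequential([
    keras.Input(shape=(5,)),
    keras.layers.Dense(2, activation='softmax'),
])
model1.compile(optimizer='rmsprop', loss='categorical_crossentropy')
old_weights = model1.get_weights()   # snapshot once, before the loop

x = np.random.rand(4, 5).astype('float32')
y = keras.utils.to_categorical(np.random.randint(0, 2, size=4), 2)

for _ in range(3):
    model1.set_weights(old_weights)  # restore the snapshot each iteration
    model1.fit(x, y, epochs=1, verbose=0)

# get_weights() returned NumPy copies, so training never touched the
# snapshot itself; restoring it brings the model back to its initial state.
model1.set_weights(old_weights)
assert all(np.array_equal(a, b) for a, b in zip(old_weights, model1.get_weights()))
```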
Tim
  • What about the state of the metrics? I would call `reset_metrics` every time in the loop too, but I find this whole scheme confusing. – James Hirschorn Jan 12 '20 at 01:01
  • @JamesHirschorn, if you are talking about history, it is reset after each fit, see here: https://github.com/keras-team/keras/issues/6697. As for the state of each layer, it depends on whether the network is stateful, see here: https://stackoverflow.com/questions/42763928/how-to-use-model-reset-states-in-keras. I agree this is a bit awkward. It may be possible to add an additional layer before the current model1 to split the samples and run these in parallel. – Tim Jan 14 '20 at 20:09
  • @Tim About your second easier method: If I have two separate networks (with the same architecture) and after a while I use set_weights into the first network collecting get_weights from the second network. Now, I have two networks with the same weights. If I modify the weights of the second (by training), without directly modifying the first network, will it affect the first network weights? Basically, what I'm asking is if set_weights/get_weights do a "shallow/deepcopy" instead of just "pointing" to a same object (Like "a = b" in Python). – ihavenoidea Mar 27 '20 at 21:13
  • I highly doubt it is a deep copy because Keras is only a wrapper for tensorflow. – Tim Mar 29 '20 at 13:44
  • @Tim So in my example, if I edit the first network's weights, will it affect second network's weights? – ihavenoidea Mar 30 '20 at 00:46
  • @ihavenoidea It should not affect the second network’s weights – Tim Mar 30 '20 at 01:34
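
On the copy-vs-reference question in the comments: get_weights() returns NumPy arrays holding copies of the variable values, and set_weights() copies values back into the model's own variables, so two models never share weight storage. A small sketch (the toy Dense model is just for illustration):

```python
import numpy as np
from tensorflow import keras

m = keras.Sequential([keras.Input(shape=(3,)), keras.layers.Dense(2)])

w = m.get_weights()   # NumPy copies of the kernel and bias
w[0][:] = 99.0        # mutate the returned arrays...

# ...the model's own weights are unaffected, confirming the copy.
assert not np.allclose(m.get_weights()[0], 99.0)
```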

These days it's trivial:

model2 = tf.keras.models.clone_model(model1)

This will give you a new model with new layers and new weights; note that the weights are freshly initialized, not copied, so use set_weights afterwards if you need the original's values. If for some reason that doesn't work (I haven't tested it), this older solution will:

model1 = Model(...)
model1.compile(...)
model1.save(savepath) # saves compiled state
model2 = keras.models.load_model(savepath)
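
A minimal round-trip sketch of the save/load approach; the throwaway Dense model, the temp directory, and the .keras filename (the newer Keras save format) are assumptions for illustration:

```python
import os
import tempfile
import numpy as np
from tensorflow import keras

model1 = keras.Sequential([keras.Input(shape=(4,)), keras.layers.Dense(2)])
model1.compile(optimizer='rmsprop', loss='mse')

# save() preserves the architecture, weights, and compiled state.
savepath = os.path.join(tempfile.mkdtemp(), 'model1.keras')
model1.save(savepath)
model2 = keras.models.load_model(savepath)

# The loaded copy produces identical predictions to the original.
x = np.random.rand(3, 4).astype('float32')
assert np.allclose(model1.predict(x, verbose=0), model2.predict(x, verbose=0))
```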
markemus
    This gave me a massive headache. After creating several different models in several iterations, and attempting to compare, I constantly got very poor performance. I'm not sure exactly what the issue was, but switching from `tf.keras.models.clone_model` to the file save equivalent fixed my issue. TLDR; tensorflow handles models and weights weirdly, and it's easy to think you're handling both well, but aren't. – Warlax56 Dec 29 '22 at 00:50

A very general way to get deep copies in Python is deepcopy from the copy module:

import copy
model2 = copy.deepcopy(model)

Are there any disadvantages to using this for Keras models?

Edit: As pointed out in the comment by GRASBOCK, this solution (copy.deepcopy) does not work reliably for tf-models: https://stackoverflow.com/a/64427748/5044463.

Jakob
  • like you suspected; according to another answer this doesn't work with tf-models: https://stackoverflow.com/a/64427748/5044463 – GRASBOCK Jan 16 '23 at 09:54