
I am trying to train a custom CNN model in TensorFlow. I want to freeze some layers of the model at specific epochs while training is still running. I have achieved freezing the layers, but I had to train the model for some epochs, then set the trainable attribute to False on the layers I wanted to freeze, then compile the model, and then start training again.

I have tried to implement it using a CustomCallback() class that freezes some layers at certain epochs, but it seems that this didn't work. As the TensorFlow documentation mentions about changing the .trainable attribute of a layer, you have to compile the model for the change to be applied, but when I do this an error emerges: "TypeError: 'NoneType' object is not callable".

This is my code:

Load libraries

import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
from keras.callbacks import ModelCheckpoint, EarlyStopping
from tensorflow.keras.utils import Sequence
from keras.models import load_model

Load dataset

#Load dataset
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.cifar10.load_data()
#Normalize
X_train, X_test = X_train/255.0, X_test/255.0

Build model

cnn = models.Sequential([
    
    layers.Conv2D(filters = 32, kernel_size = (1,1), padding = "same", activation = "relu", input_shape = (32,32,3)),
    layers.Conv2D(filters = 64, kernel_size = (3,3), padding = "same", activation = "relu"),
    layers.MaxPool2D(pool_size = (2,2)),
    
    layers.Conv2D(filters = 64, kernel_size = (3,3), padding = "same", activation = "relu"),
    layers.Conv2D(filters = 128, kernel_size = (5,5), padding = "same", activation = "relu"),
    layers.MaxPool2D(pool_size = (2,2)),
    
    layers.Flatten(),
    layers.Dense(64, activation = "relu"),
    layers.Dense(128, activation = "relu"),
    layers.Dense(64, activation = "relu"),
    layers.Dense(10, activation = "softmax")  
])

Create CustomCallback Class

class CustomCallback(tf.keras.callbacks.Callback):
    def on_epoch_begin(self, epoch, logs = None):
        if epoch == 5:
            cnn.layers[0].trainable, cnn.layers[1].trainable, cnn.layers[2].trainable = (False, False, False)
            cnn.compile(optimizer = optimizer, loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])
        elif epoch == 10:
            cnn.layers[3].trainable, cnn.layers[4].trainable, cnn.layers[5].trainable = (False, False, False)
            cnn.compile(optimizer = optimizer, loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])
        elif epoch == 15:
            cnn.layers[6].trainable, cnn.layers[7].trainable, cnn.layers[8].trainable = (False, False, False)
            cnn.compile(optimizer = optimizer, loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])

Define optimizer and compile

#Define the optimizer
optimizer = tf.keras.optimizers.Adam(learning_rate = 0.001)

#Compile the model
cnn.compile(optimizer = optimizer , loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])

Train model

results = cnn.fit(X_train, y_train, epochs = 20, validation_data = (X_test, y_test), batch_size = 1024, callbacks = [CustomCallback()])

An error pops up: "TypeError: 'NoneType' object is not callable". If I don't compile the model after freezing some layers, no error occurs, but then all layers are updated in every epoch during training.

  • Please post the _full traceback_; as it stands, we have no idea where the error occurs, so it's difficult to help. – xdurch0 Feb 06 '23 at 15:02
  • Unfortunately, I don't think it's possible to recompile the model like this. See this [ticket](https://github.com/keras-team/keras/issues/17255#issuecomment-1328594424). – Innat Feb 06 '23 at 18:44

1 Answer


OK, as pointed out, in order to change the status of a layer one has to recompile the model. So what I did was train the model for 5 epochs, then save the weights to a file. Then I set layer 7 to not trainable and recompiled the model. Then I loaded the saved weights into the model and ran 5 more epochs. At the end of those epochs I compared the weights with those I loaded, and they are the same. The code is shown below, starting after the model was compiled:

print('{0:^8s}{1:^80s}{2:^12s}'.format('Layer', 'Layer Description', 'Trainable'))
for i, layer in enumerate(cnn.layers):
    print('{0:^8s}{1:^80s}{2:^12s}'.format(str(i), str(layer), str(layer.trainable)))

This just prints the information for each layer in the model, per the printout shown below.

Layer                                 Layer Description                                 Trainable  
   0            <keras.layers.convolutional.Conv2D object at 0x00000261CCB7A370>            True    
   1            <keras.layers.convolutional.Conv2D object at 0x00000261E55F4700>            True    
   2            <keras.layers.pooling.MaxPooling2D object at 0x00000261E55F4970>            True    
   3            <keras.layers.convolutional.Conv2D object at 0x00000261E567B160>            True    
   4            <keras.layers.convolutional.Conv2D object at 0x00000261E567B280>            True    
   5            <keras.layers.pooling.MaxPooling2D object at 0x00000261E55F44C0>            True    
   6            <keras.layers.core.flatten.Flatten object at 0x00000261E567B700>            True    
   7              <keras.layers.core.dense.Dense object at 0x00000261E567BD30>              True    
   8              <keras.layers.core.dense.Dense object at 0x00000261E5680070>              True    
   9              <keras.layers.core.dense.Dense object at 0x00000261E56802B0>              True    
   10             <keras.layers.core.dense.Dense object at 0x00000261E56805B0>              True    

Then I trained the model for 5 epochs and printed out the weights and biases. The code is below:

history = cnn.fit(x=train_gen, epochs=5, verbose=1, validation_data=valid_gen,
                  validation_steps=None, shuffle=True, initial_epoch=0)  # train the model
weights_and_biases = cnn.layers[7].get_weights()
weights = weights_and_biases[0]
print('shape of weights is= ', weights.shape)  # has 64 nodes receiving 131072 inputs from the flatten layer
biases = weights_and_biases[1]
print('shape of biases is= ', biases.shape)
first_10_weights = weights[0][0:10]
print(first_10_weights)
first_10_biases = biases[0:10]
print(first_10_biases)

The printout of the weights and biases at the end of the 5th epoch is shown below

shape of weights is=  (131072, 64)
shape of biases is=  (64,)
[-0.00171461 -0.00061654 -0.0004427   0.006399    0.00065272  0.00117902
  0.00206342 -0.00248441 -0.00172774  0.00399113]
[-0.0098094  -0.01114658 -0.00550008  0.00675221 -0.00647649  0.01904665
  0.0103933   0.01889692 -0.01373082  0.00189758]

Then I saved the weights to a file, changed the state of layer 7 to not trainable, and recompiled the model. After compiling, I loaded the saved weights into the model and again printed out the weights and biases to make sure they loaded correctly. The code is below:

filepath = r'C:\DATASETS\spiders\run1.h5'  # save the weights at the end of 5 epochs to this file
cnn.save_weights(filepath, overwrite=True, save_format=None, options=None)  # save the weights
cnn.layers[7].trainable = False  # make layer 7 not trainable
cnn.compile(optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"])  # compile the model
cnn.load_weights(filepath, by_name=False, skip_mismatch=False, options=None)  # load the model with the saved weights
weights_and_biases = cnn.layers[7].get_weights()  # get the weights to make sure they are the same as at the end of epoch 5
weights = weights_and_biases[0]
print('shape of weights is= ', weights.shape)  # has 64 nodes receiving 131072 inputs from the flatten layer
biases = weights_and_biases[1]
print('shape of biases is= ', biases.shape)
first_10_weights = weights[0][0:10]
print(first_10_weights)
first_10_biases = biases[0:10]
print(first_10_biases)

The printed results shown below were as expected

shape of weights is=  (131072, 64)
shape of biases is=  (64,)
[-0.00171461 -0.00061654 -0.0004427   0.006399    0.00065272  0.00117902
  0.00206342 -0.00248441 -0.00172774  0.00399113]
[-0.0098094  -0.01114658 -0.00550008  0.00675221 -0.00647649  0.01904665
  0.0103933   0.01889692 -0.01373082  0.00189758]

Then I trained for 5 more epochs. At the end of those epochs I printed out the layer 7 weights, which should not have changed. The code is shown below:

history = cnn.fit(x=train_gen, epochs=5, verbose=1, validation_data=valid_gen,
                  validation_steps=None, shuffle=True, initial_epoch=0)  # train the model
weights_and_biases = cnn.layers[7].get_weights()
weights = weights_and_biases[0]
print('shape of weights is= ', weights.shape)  # has 64 nodes receiving 131072 inputs from the flatten layer
biases = weights_and_biases[1]
print('shape of biases is= ', biases.shape)
first_10_weights = weights[0][0:10]
print(first_10_weights)
first_10_biases = biases[0:10]
print(first_10_biases)

The resulting printout, shown below, confirms that the weights and biases did not change:

shape of weights is=  (131072, 64)
shape of biases is=  (64,)
[-0.00171461 -0.00061654 -0.0004427   0.006399    0.00065272  0.00117902
  0.00206342 -0.00248441 -0.00172774  0.00399113]
[-0.0098094  -0.01114658 -0.00550008  0.00675221 -0.00647649  0.01904665
  0.0103933   0.01889692 -0.01373082  0.00189758]

So the process is: build and compile your model; run for N epochs; save the weights to a file; change the trainable status of the layers; recompile the model; load the saved weights; and continue training.
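Putting that together, here is a minimal sketch of the whole process applied to the CIFAR-10 model from the question. The stage lengths, the choice to freeze the first convolutional block, and the weights file name are illustrative assumptions, not part of the original answer:

# Minimal sketch: save -> freeze -> recompile -> reload -> resume.
# Assumes cnn, optimizer, X_train/y_train, and X_test/y_test from the question.
filepath = "stage1_weights.h5"  # hypothetical weights file

# Stage 1: train all layers for 5 epochs
cnn.fit(X_train, y_train, epochs=5, validation_data=(X_test, y_test), batch_size=1024)

# Save the current weights, freeze the first convolutional block, recompile
cnn.save_weights(filepath, overwrite=True)
for layer in cnn.layers[:3]:
    layer.trainable = False
cnn.compile(optimizer=optimizer, loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Reload the saved weights, then continue training; the frozen layers stay fixed
cnn.load_weights(filepath)
cnn.fit(X_train, y_train, epochs=10, initial_epoch=5,
        validation_data=(X_test, y_test), batch_size=1024)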

Gerry P
  • That doesn't solve the problem. The main target is to change the training status of layers WHILE the model is training. I have also implemented what you described, and there is an easier way: you don't have to save the weights and load them again; you just train for N epochs, then make any changes you want, then recompile, and train again. – xaristeidou Feb 15 '23 at 07:34
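For reference, a minimal sketch of the simpler variant this comment describes, under the same assumptions as above. Recompiling a Keras model does not reset its weights, so the save/load round-trip can be skipped:

# Simpler variant: recompile in place between fit() calls; weights stay in memory.
cnn.fit(X_train, y_train, epochs=5, validation_data=(X_test, y_test), batch_size=1024)

for layer in cnn.layers[:3]:  # freeze the first convolutional block (illustrative choice)
    layer.trainable = False
cnn.compile(optimizer=optimizer, loss="sparse_categorical_crossentropy", metrics=["accuracy"])

cnn.fit(X_train, y_train, epochs=10, initial_epoch=5,
        validation_data=(X_test, y_test), batch_size=1024)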