I want to fine-tune EfficientNet using tf.keras (TensorFlow 2.3), but I cannot change the training status of the layers properly. My model looks like this:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import EfficientNetB3

data_augmentation_layers = tf.keras.Sequential([
    keras.layers.experimental.preprocessing.RandomFlip("horizontal_and_vertical"),
    keras.layers.experimental.preprocessing.RandomRotation(0.8)])

efficientnet = EfficientNetB3(weights="imagenet", include_top=False,
                              input_shape=(*img_size, 3))

# Set to not trainable, as described in the standard Keras FAQ
efficientnet.trainable = False

inputs = keras.layers.Input(shape=(*img_size, 3))
augmented = data_augmentation_layers(inputs)
base = efficientnet(augmented, training=False)
pooling = keras.layers.GlobalAveragePooling2D()(base)
outputs = keras.layers.Dense(5, activation="softmax")(pooling)

model = keras.Model(inputs=inputs, outputs=outputs)

model.compile(loss="categorical_crossentropy", optimizer=keras_opt, metrics=["categorical_accuracy"])

This is done so that the randomly initialized weights of my custom top don't destroy the pretrained weights right away.

    Model: "functional_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         [(None, 512, 512, 3)]     0         
_________________________________________________________________
sequential (Sequential)      (None, 512, 512, 3)       0         
_________________________________________________________________
efficientnetb3 (Functional)  (None, 16, 16, 1536)      10783535  
_________________________________________________________________
global_average_pooling2d (Gl (None, 1536)              0         
_________________________________________________________________
dense (Dense)                (None, 5)                 7685      
=================================================================
Total params: 10,791,220
Trainable params: 7,685
Non-trainable params: 10,783,535

Everything seems to work up to this point. I train my model for 2 epochs, and then I want to start fine-tuning the EfficientNet base. Thus I call:

# Unfreeze everything in the base except the BatchNormalization layers
for l in model.get_layer("efficientnetb3").layers:
    if not isinstance(l, keras.layers.BatchNormalization):
        l.trainable = True

model.compile(loss="categorical_crossentropy", optimizer=keras_opt, metrics=["categorical_accuracy"])

I recompiled and printed the summary again, only to see that the number of non-trainable weights remained the same. Fitting also does not bring better results than keeping the base frozen.

dense (Dense)                (None, 5)                 7685      
=================================================================
Total params: 10,791,220
Trainable params: 7,685
Non-trainable params: 10,783,535

PS: I also tried `efficientnet.trainable = True`, but this also had no effect.
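
In tf.keras, a layer's weights only count as trainable if every enclosing layer is trainable as well, so the nested model's own flag gates all of its sublayers. A minimal check of that mismatch (the layer name is taken from the summary above):

inner = model.get_layer("efficientnetb3")
print(inner.trainable)                # the gating flag on the nested model
print(len(inner.trainable_weights))   # what the optimizer will actually see
for l in inner.layers[:3]:
    print(l.name, l.trainable)        # sublayer flags can read True regardless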

Could it have something to do with the fact that I'm using a Sequential and a functional model at the same time?

J-H
  • I just found out that iterating over the layers of my base model and checking whether they are trainable returns True. However, when I compile and print the summary, they are listed as non-trainable. – J-H Feb 01 '21 at 14:41
  • For me, it shows up in `model.summary()` too. – Nicolas Gervais Feb 01 '21 at 14:47
  • That's weird... are you using TF 2.4? – J-H Feb 01 '21 at 14:49
  • TF 2.3 in a conda env. You? – Nicolas Gervais Feb 01 '21 at 15:12
  • TF 2.3 in Colab, downgraded from 2.4 using !pip install. Thanks for your input, I guess something's wrong with my version then. Hopefully I will find a solution. – J-H Feb 01 '21 at 15:18
  • It was also working for me with TF 2.3 in Conda. – Frightera Feb 01 '21 at 15:30
  • After setting `trainable` to true, did you update `base = efficientnet(augmented, training=True)`? – nipun Feb 01 '21 at 15:48
  • No, I didn't do that. In e.g. https://www.tensorflow.org/tutorials/images/transfer_learning they didn't do that either. I thought this is just a parameter indicating that the batch norm layers are kept in inference mode? – J-H Feb 01 '21 at 16:29
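
For context on those last two comments: `training` and `trainable` are independent in tf.keras. A minimal sketch of the distinction, reusing the question's variables:

# `trainable` is a layer attribute: it controls whether the weights receive
# gradient updates. `training` is a call-time argument: for BatchNormalization
# it controls whether batch statistics or the stored moving averages are used.
base = efficientnet(augmented, training=False)  # BN stays in inference mode
efficientnet.trainable = True                   # weights can still be updated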

1 Answer


For me, the problem was using the Sequential API for part of the model. When I changed to the functional API, my model.summary() displayed all the sublayers, and it was possible to set some of them as trainable and others not.

Emil Haas
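
A minimal sketch of one way to get such a flattened, all-functional model (an assumption on my part, not necessarily the answer's exact code; `img_size`, `keras_opt`, and `data_augmentation_layers` are taken from the question). Passing the augmented tensor as `input_tensor` builds EfficientNet's layers directly into the outer graph instead of nesting the whole network as a single "Functional" layer, so every sublayer shows up in `model.summary()` and can be frozen individually:

from tensorflow import keras
from tensorflow.keras.applications import EfficientNetB3

inputs = keras.layers.Input(shape=(*img_size, 3))
augmented = data_augmentation_layers(inputs)

# input_tensor= inlines the EfficientNet layers into this graph
base = EfficientNetB3(weights="imagenet", include_top=False,
                      input_tensor=augmented)

pooling = keras.layers.GlobalAveragePooling2D()(base.output)
outputs = keras.layers.Dense(5, activation="softmax")(pooling)
model = keras.Model(inputs=inputs, outputs=outputs)

# Each sublayer is now a direct member of `model`, so per-layer freezing works
for l in model.layers:
    l.trainable = not isinstance(l, keras.layers.BatchNormalization)
model.compile(loss="categorical_crossentropy", optimizer=keras_opt,
              metrics=["categorical_accuracy"])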