I have the following simple transfer-learning model: a pretrained VGG16 base without the FC layers, followed by a few new layers, defined with the Keras sequential API.
import tensorflow as tf

IMG_SHAPE = (150, 150, 3)

# VGG16 base without the fully connected layers
pretrained_model = tf.keras.applications.vgg16.VGG16(
    weights='imagenet',
    include_top=False,
    input_shape=IMG_SHAPE,
)

# freeze the pretrained layers
pretrained_model.trainable = False

model = tf.keras.Sequential([
    pretrained_model,
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2, activation='softmax'),
])
Notice that the model summary does not show the internal layers of VGG16:
model.summary()
#Model: "sequential"
#_________________________________________________________________
# Layer (type)                 Output Shape              Param #
#=================================================================
# vgg16 (Functional)           (None, 4, 4, 512)         14714688
#
# batch_normalization (BatchN  (None, 4, 4, 512)         2048
# ormalization)
#
# flatten (Flatten)            (None, 8192)               0
#
# dense (Dense)                (None, 2)                  16386
#=================================================================
#Total params: 14,733,122
#Trainable params: 17,410
#Non-trainable params: 14,715,712
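The nested layers are still accessible, though; for example, they can be listed by calling summary() on the nested model itself (model.get_layer('vgg16') is the same object as pretrained_model here):
# list the internal VGG16 layers
model.get_layer('vgg16').summary()
# or just print the layer names
for layer in pretrained_model.layers:
    print(layer.name)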
I have trained the above model on my custom dataset and reached the desired accuracy on my test set.
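For completeness, this is roughly how the model was compiled and trained (a sketch; train_ds and val_ds stand in for my actual tf.data pipelines of (image, label) batches):
# sketch of the training setup; train_ds / val_ds are placeholders
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
)
model.fit(train_ds, validation_data=val_ds, epochs=10)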
Now, say I want to create a new model (e.g., to compute an activation map) that takes the same input as the previous model and returns two outputs: the intermediate features of a convolution layer of the pretrained model (e.g., block5_conv3) and the output of the previous model. This is where I am stuck and keep getting errors. For example, I have defined the new model as follows:
grad_model = tf.keras.models.Model(
    [pretrained_model.inputs],
    [pretrained_model.get_layer('block5_conv3').output, model.output]
)
where I am getting the following error:
ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor(type_spec=TensorSpec(shape=(None, 150, 150, 3), dtype=tf.float32, name='vgg16_input'), name='vgg16_input', description="created by layer 'vgg16_input'") at layer "vgg16". The following previous layers were accessed without issue: ['block1_conv1', 'block1_conv2', 'block1_pool', 'block2_conv1', 'block2_conv2', 'block2_pool', 'block3_conv1', 'block3_conv2', 'block3_conv3', 'block3_pool', 'block4_conv1', 'block4_conv2', 'block4_conv3']
or like the following:
grad_model = tf.keras.models.Model(
    [model.inputs],
    [pretrained_model.get_layer('block5_conv3').output, model.output]
)
where I am getting the following error:
ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor(type_spec=TensorSpec(shape=(None, 150, 150, 3), dtype=tf.float32, name='input_1'), name='input_1', description="created by layer 'vgg16'") at layer "block1_conv1". The following previous layers were accessed without issue: []
I have also tried setting the name of the input layer of the nested pretrained model to match the input layer name of the outer model:
pretrained_model.layers[0]._name = model.layers[0]._name
but I still get the same error.
I think the model structure could be changed (e.g., by rebuilding it with the Keras functional API) so that grad_model can be defined, but I am not sure how. More importantly, I would like to know whether the issue can be resolved without changing the model structure, i.e., without having to retrain.
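For reference, this is roughly how I intend to use grad_model once it is defined (the standard Grad-CAM computation; heatmap resizing/overlay omitted, and img is assumed to be a preprocessed batch of shape (1, 150, 150, 3)):
# sketch of the intended Grad-CAM usage; `img` is a preprocessed
# batch of shape (1, 150, 150, 3)
with tf.GradientTape() as tape:
    conv_output, predictions = grad_model(img)
    class_idx = tf.argmax(predictions[0])
    class_score = predictions[:, class_idx]

# gradient of the class score w.r.t. the conv feature map
grads = tape.gradient(class_score, conv_output)

# channel-wise importance weights, then a weighted sum over channels
weights = tf.reduce_mean(grads, axis=(0, 1, 2))
cam = tf.reduce_sum(conv_output[0] * weights, axis=-1)
cam = tf.nn.relu(cam)  # keep only the positive contributions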