How to use pre-trained model as non trainable sub network in tensorflow?

Question

I'd like to train a network that contains a sub-network that needs to stay fixed during training. The basic idea is to prepend and append some layers to the pre-trained network (InceptionV3):

new_layers -> pre-trained and fixed sub-net (inceptionv3) -> new_layers

and run the training process for my task without changing the pre-trained part. I also need to branch directly onto some layer of the pre-trained network. For example, with InceptionV3 I'd like to use it from the 299x299 conv layer to the last pool layer, or from the 79x79 conv layer to the last pool layer.

score 5 · Answer 1 · answered Feb 03 '16 at 16:19

Whether or not a "layer" is trained is determined by whether the variables used in that layer get updated with gradients. If you are using the Optimizer interface to optimize your network, you can simply leave the variables of the layers you want to keep fixed out of the var_list argument of the minimize function, i.e.,

opt.minimize(loss, var_list=<subset of variables you want to train>)

If you are using the tf.gradients function directly, remove the variables that you want to keep fixed from its second argument (xs), so that no gradients are computed for them.
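The principle can be illustrated without TensorFlow at all. The following is a minimal pure-Python sketch, not TensorFlow code: the toy model, the parameter names `w_pre` and `w_new`, and the helper functions are all hypothetical. It only shows that a parameter left out of the list of variables to update keeps its value, while the others are trained.

```python
# Toy model (hypothetical): pred = w_new * (w_pre * x)
# "w_pre" stands in for a pre-trained weight, "w_new" for a newly added one.

def loss_and_grads(params, x, target):
    hidden = params["w_pre"] * x        # output of the "pre-trained" part
    pred = params["w_new"] * hidden     # output of the new layer
    err = pred - target
    loss = err ** 2
    # Analytic gradients of the squared error w.r.t. both parameters.
    grads = {
        "w_pre": 2.0 * err * params["w_new"] * x,
        "w_new": 2.0 * err * hidden,
    }
    return loss, grads

def sgd_step(params, grads, var_list, lr=0.05):
    # Only the names in var_list are updated; everything else stays frozen.
    for name in var_list:
        params[name] -= lr * grads[name]

params = {"w_pre": 2.0, "w_new": 0.0}
for _ in range(200):
    _, grads = loss_and_grads(params, x=1.0, target=6.0)
    sgd_step(params, grads, var_list=["w_new"])  # "w_pre" is left out on purpose

print(params)
```

Gradients are still computed for `w_pre` here, but they are never applied, which is the same effect as leaving a variable out of the optimizer's variable list.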

Now, how you "branch directly" onto a layer of a pre-trained network depends on how that network is implemented. I would locate the tf.nn.conv2d call for the 299x299 layer you mention and pass the output of your new layer as its input; on the output side, locate the 79x79 layer and use its output as the input to your new layer.
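As a rough picture of that wiring, here is a pure-Python sketch in which the pre-trained network is just an ordered list of named functions. Everything in it is an assumption made for illustration: the layer names, the arithmetic inside each "layer", and the `run_subnet` helper do not correspond to the real InceptionV3 graph or to any TensorFlow API.

```python
# Hypothetical stand-ins for pre-trained layers, in forward order.
pretrained = [
    ("conv_299", lambda v: v * 2.0),
    ("conv_79", lambda v: v + 1.0),
    ("last_pool", lambda v: v / 2.0),
]

def run_subnet(layers, value, start, end):
    """Apply the layers from `start` through `end` (inclusive), looked up by name."""
    names = [name for name, _ in layers]
    first, last = names.index(start), names.index(end)
    for _, fn in layers[first:last + 1]:
        value = fn(value)
    return value

def new_in(v):
    # Hypothetical new layer placed in front of the pre-trained part.
    return v - 1.0

def new_out(v):
    # Hypothetical new layer placed behind the pre-trained part.
    return v * 10.0

# Enter at the first conv layer, or branch in at a later one.
full = new_out(run_subnet(pretrained, new_in(4.0), "conv_299", "last_pool"))
partial = new_out(run_subnet(pretrained, new_in(4.0), "conv_79", "last_pool"))
print(full, partial)
```

In a real graph the equivalent step is finding the tensor that feeds the chosen layer and replacing it with the output of your own layer, which is why the details depend on how the pre-trained model was built.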

keveman

Note that one common way to get the set of variables to train is the tf.trainable_variables() function (https://www.tensorflow.org/versions/0.6.0/api_docs/python/state_ops.html#trainable_variables), which returns the collection of variables that were created with trainable=True (the default). You can exclude a variable from that collection by passing trainable=False when constructing it: https://www.tensorflow.org/versions/0.6.0/api_docs/python/state_ops.html#Variable.__init__ – Josh11b Feb 03 '16 at 20:07
Here is another question related to updating variables for only some layers: http://stackoverflow.com/questions/34945554/how-to-set-layer-wise-learning-rate-in-tensorflow – mathetes Feb 03 '16 at 21:27
And how can I set the input of each layer? Do I need to use the sess.run function, as in the classify_image example? – jrabary Feb 05 '16 at 15:42
