
The use case I have in mind is to add more layers to a pre-trained network, and I would like to tune the entire net. However, I'd like the newly added layers to have a bigger learning rate than the existing ones. Is it possible to do this in TensorFlow?

Ying Xiong
  • Possible duplicate of [How to set layer-wise learning rate in Tensorflow?](https://stackoverflow.com/questions/34945554/how-to-set-layer-wise-learning-rate-in-tensorflow) – almightyGOSU Dec 06 '17 at 10:09

1 Answer


You could use a similar approach to the one mentioned here.

Basically, set a different variable scope around each part of the network that you want to train with a separate learning rate, then:

optimizer1 = tf.train.AdagradOptimizer(0.0001)
optimizer2 = tf.train.AdagradOptimizer(0.01)

# Collect the trainable variables created under each scope prefix
first_train_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                                     "scope/prefix/for/first/vars")
first_train_op = optimizer1.minimize(cost, var_list=first_train_vars)

second_train_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                                      "scope/prefix/for/second/vars")
second_train_op = optimizer2.minimize(cost, var_list=second_train_vars)
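Both train ops can then be fetched in a single session call. Below is a minimal training-loop sketch under the assumption that `cost` is the shared loss tensor from above; `x`, `y`, `batch_inputs`, `batch_labels`, and `num_steps` are illustrative names, not anything defined in the original answer:

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(num_steps):
        # Fetching both train ops in one call updates both variable groups,
        # each with its own optimizer and learning rate.
        loss_value, _, _ = sess.run(
            [cost, first_train_op, second_train_op],
            feed_dict={x: batch_inputs, y: batch_labels})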
Steven
  • Do I have to call both optimizers individually during training? `loss_value, _ = sess.run([reduced_loss, optimizer1])` and then `loss_value, _ = sess.run([reduced_loss, optimizer2])`? – mcExchange Jan 27 '17 at 16:14
  • You can also do `loss_value, _, _ = sess.run([reduced_loss, optimizer1, optimizer2])`, but you do need to run both optimizers for them to have an effect. – Steven Jan 27 '17 at 17:17
  • I wonder what the order will be in which `optimizer1` and `optimizer2` are called in the case of `sess.run([reduced_loss, optimizer1, optimizer2])`? It seems the order is not guaranteed: https://stackoverflow.com/questions/43844510/is-session-runfetches-guaranteed-to-execute-its-fetches-arguments-in-order – mrgloom Sep 25 '19 at 13:31
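If the ordering of the two update ops is a concern, one option (a sketch, not part of the original answer) is to wrap both of them in a single op with `tf.group`. This does not impose an order between the two updates, but it gives a single `train_op` to fetch so neither update can be forgotten. Here `reduced_loss` is the loss tensor referred to in the comments above:

# Group the two updates into one op; both run whenever train_op is fetched.
train_op = tf.group(first_train_op, second_train_op)
loss_value, _ = sess.run([reduced_loss, train_op])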