I know that it is possible to freeze single layers in a network, for example to train only the last layers of a pre-trained model. What I'm looking for is a way to apply different learning rates to different layers.
So, for example, a very low learning rate of 0.000001 for the first layer, then a gradually increasing learning rate for each of the following layers, so that the last layer ends up with a learning rate of 0.01 or so.
Is this possible in PyTorch? Any idea how I can achieve this?
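For illustration, this is roughly what I have in mind, using optimizer parameter groups with a made-up three-layer model (just a sketch of the idea, not tested):

```python
import torch
import torch.nn as nn

# Hypothetical model: three stacked linear layers.
model = nn.Sequential(
    nn.Linear(128, 64),
    nn.Linear(64, 32),
    nn.Linear(32, 10),
)

# One parameter group per layer, each with its own learning rate.
optimizer = torch.optim.SGD(
    [
        {"params": model[0].parameters(), "lr": 1e-6},  # first layer: very low lr
        {"params": model[1].parameters(), "lr": 1e-4},  # middle layer
        {"params": model[2].parameters(), "lr": 1e-2},  # last layer: highest lr
    ],
    lr=1e-2,  # default lr for any group that does not set its own
)
```

Would something along these lines be the recommended way, or is there a cleaner approach (e.g. for models with many layers, where listing every layer by hand gets tedious)?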