
I know it's possible to set a learning rate per layer (link). I also found how to change the learning rate dynamically in the middle of training, without a scheduler (link).
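For reference, the per-layer version can be done with parameter groups, whose rates can also be edited mid-training. A minimal sketch (the layer names refer to the model defined below):

import torch.optim as optim

# one parameter group per layer, each with its own learning rate
optimizer = optim.SGD([
    {"params": model.fc1.parameters(), "lr": 0.01},
    {"params": model.fc2.parameters(), "lr": 0.001},
    {"params": model.fc3.parameters(), "lr": 0.0001},
])

# change a layer's learning rate mid-training, without a scheduler
optimizer.param_groups[0]["lr"] = 0.005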

How can I create an optimizer with a dynamic learning rate per neuron, so that I can change the learning rate of specific neurons during training?

As an example, if my network is as follows:

import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()

        self.fc1 = nn.Linear(3, 5)
        self.fc2 = nn.Linear(5, 10)
        self.fc3 = nn.Linear(10, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = torch.relu(self.fc3(x))
        return x

There should be 5 learning rates for the first layer (one for each of the 5 neurons, where each neuron has 3 associated weights), 10 for the second layer, and 1 for the last one.
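To make the goal concrete, here is a rough, untested sketch of the kind of update I have in mind; the `lrs` dict and the manual update loop are illustrations, not an existing API:

import torch

# hypothetical per-neuron learning rates: one tensor per layer,
# with one entry per output neuron of that layer
lrs = {
    "fc1": torch.full((5,), 0.01),
    "fc2": torch.full((10,), 0.01),
    "fc3": torch.full((1,), 0.01),
}

model = Model()
# ... forward pass and loss.backward() happen here ...

with torch.no_grad():
    for name, layer in [("fc1", model.fc1), ("fc2", model.fc2), ("fc3", model.fc3)]:
        lr = lrs[name]
        # weight has shape (out_features, in_features); row i holds the
        # weights of neuron i, so broadcasting lr over the rows gives a
        # per-neuron step size
        layer.weight -= lr.unsqueeze(1) * layer.weight.grad
        layer.bias -= lr * layer.bias.grad

# change the learning rate of a single neuron during training
lrs["fc2"][3] = 0.05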

Axo
  • Is a fully connected layer implemented as a matrix calculation in PyTorch, i.e. `out = activation(xW + b)`? Maybe a "per-neuron" learning rate would use a vector or matrix element-wise rate multiplier instead of a scalar when updating `W`? To me the idea of "per-neuron" is a little confusing when the matrix multiplication is the whole layer (see the sketch after these comments). – xdhmoore Feb 13 '22 at 23:40
  • @xdhmoore I'm not sure, sorry – Axo Feb 14 '22 at 04:23
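Following up on xdhmoore's comment, one untested way to get an element-wise rate multiplier while keeping a standard optimizer might be a gradient hook that rescales each neuron's row of the gradient (`scale_fc2` is a made-up name for illustration):

import torch

model = Model()

# hypothetical per-neuron multipliers for fc2: one value per output
# neuron, broadcast across that neuron's incoming weights
scale_fc2 = torch.ones(10)

# rescale each neuron's gradient before the optimizer step, so a plain
# optimizer with a single lr effectively applies lr * scale per neuron
model.fc2.weight.register_hook(lambda g: g * scale_fc2.unsqueeze(1))
model.fc2.bias.register_hook(lambda g: g * scale_fc2)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# during training, raise the effective lr of neuron 3 in fc2 to 0.05
scale_fc2[3] = 5.0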

0 Answers