
When searching for ways to implement L1 regularization in PyTorch models, I came across this question, which is now two years old, so I was wondering whether there is anything new on this topic.

I also found this more recent approach to dealing with the missing L1 function. However, I don't understand how to use it for a basic NN like the one shown below.

import torch.nn as nn
import torch.nn.functional as F

class FFNNModel(nn.Module):
    def __init__(self, input_dim, output_dim, hidden_dim, dropout_rate):
        super(FFNNModel, self).__init__()
        self.input_dim = input_dim
        self.output_dim = output_dim
        self.hidden_dim = hidden_dim
        self.dropout_rate = dropout_rate
        self.drop_layer = nn.Dropout(p=self.dropout_rate)
        self.fully = nn.ModuleList()
        current_dim = input_dim
        for h_dim in hidden_dim:
            self.fully.append(nn.Linear(current_dim, h_dim))
            current_dim = h_dim
        self.fully.append(nn.Linear(current_dim, output_dim))

    def forward(self, x):
        for layer in self.fully[:-1]:
            x = self.drop_layer(F.relu(layer(x)))
        x = F.softmax(self.fully[-1](x), dim=0)
        return x

I was hoping simply putting this before training would work:

model = FFNNModel(30,5,[100,200,300,100],0.2)
regularizer = _Regularizer(model)
regularizer = L1Regularizer(regularizer, lambda_reg=0.1)

with

out = model(inputs)
loss = criterion(out, target) + regularizer.__add_l1()

Does anyone understand how to apply these 'ready to use' classes?

Quastiat
  • You might want to consider adding your **"EDIT / SIMPLE SOLUTION"** as an answer to the question, instead of including it in the question body itself. – iacob Mar 13 '21 at 14:33
  • Does this answer your question? [Pytorch: how to add L1 regularizer to activations?](https://stackoverflow.com/questions/44641976/pytorch-how-to-add-l1-regularizer-to-activations) – iacob Mar 13 '21 at 14:34
  • Please do not include the answer in the question; post a separate answer instead. – desertnaut Mar 15 '21 at 00:45

4 Answers


I haven't run the code in question, so please follow up if something doesn't work exactly as described. Generally, I would say the code you linked is needlessly complicated (possibly because it tries to be generic and allow for every kind of regularization it supports). The way it is meant to be used is, I suppose,

model = FFNNModel(30,5,[100,200,300,100],0.2)
regularizer = L1Regularizer(model, lambda_reg=0.1)

and then

out = model(inputs)
loss = criterion(out, target) + regularizer.regularized_all_param(0.)

You can check that regularized_all_param will just iterate over the parameters of your model and, for those whose name ends with weight, accumulate the sum of their absolute values. For some reason the accumulator has to be initialized manually, which is why we pass in the 0. as the starting value.
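
For illustration, a hypothetical stand-in for what that method is described to do (not the repo's actual code; the function name here is made up) might look like this:

def l1_of_weights(model, accumulator=0.):
    # walk over the named parameters, pick those whose name ends in 'weight',
    # and add up their L1 norms; the scaling by lambda_reg that the class
    # applies is left out here
    for name, param in model.named_parameters():
        if name.endswith('weight'):
            accumulator = accumulator + param.abs().sum()
    return accumulator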

Really though, if you just want L1 regularization and don't need any bells and whistles, the more manual approach, akin to your first link, will be more readable. It would go like this:

# accumulate the L1 norm of every parameter and add it as a term to the loss
l1_regularization = 0.
for param in model.parameters():
    l1_regularization += param.abs().sum()
loss = criterion(out, target) + l1_regularization

This is really what is at the heart of both approaches. You use the Module.parameters method to iterate over all model parameters and sum up their L1 norms, which then becomes a term in your loss function. That's it. The repo you linked comes up with some fancy machinery to abstract it away but, judging by your question, fails :)
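
For completeness, a minimal sketch of where such a term sits in a training step (the optimizer, criterion, and the example strength l1_lambda are assumptions on my part, not something defined in your code or the repo):

optimizer.zero_grad()
out = model(inputs)

l1_lambda = 0.1  # example value, not from the original code; tune for your problem
l1_regularization = sum(p.abs().sum() for p in model.parameters())

# add the scaled L1 term to the task loss, then optimize as usual
loss = criterion(out, target) + l1_lambda * l1_regularization
loss.backward()
optimizer.step()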

Jatentaki

SIMPLE SOLUTION for anyone stumbling over this:

There were always some issues with the _Regularizer classes from the link above, so I solved the problem with plain functions instead, adding an orthogonal regularizer as well:

import torch

def l1_regularizer(model, lambda_l1=0.01):
    # sum of absolute values of every weight matrix, scaled by lambda_l1
    lossl1 = 0
    for model_param_name, model_param_value in model.named_parameters():
        if model_param_name.endswith('weight'):
            lossl1 += lambda_l1 * model_param_value.abs().sum()
    return lossl1


def orth_regularizer(model, lambda_orth=0.01):
    # penalize the deviation of W @ W.T from the identity for every weight matrix
    lossorth = 0
    for model_param_name, model_param_value in model.named_parameters():
        if model_param_name.endswith('weight'):
            param_flat = model_param_value.view(model_param_value.shape[0], -1)
            sym = torch.mm(param_flat, torch.t(param_flat))
            sym = sym - torch.eye(param_flat.shape[0], device=param_flat.device)
            lossorth += lambda_orth * sym.abs().sum()
    return lossorth

and during training do:

loss = criterion(outputs, y_data) \
       + l1_regularizer(model, lambda_l1=lambda_l1) \
       + orth_regularizer(model, lambda_orth=lambda_orth)
Quastiat

You can apply L1 regularization to the loss function with the following code:

loss = loss_fn(outputs, labels)
l1_lambda = 0.001
l1_norm = sum(p.abs().sum() for p in model.parameters())

loss = loss + l1_lambda*l1_norm

Source: Deep Learning with PyTorch (8.5.2)

iacob

Equivalent to the overly complicated regularizer code from the module you referenced:

l1_loss = lambda_reg * sum(weight.abs().sum()
                           for name, weight in model.named_parameters()
                           if name.endswith('weight'))
loss = criterion(out, target) + l1_loss
Tiana