L2 regularization in caffe, conversion from lasagne

Question

I have some Lasagne code and want to build the same network in Caffe. I was able to convert the network itself, but I need help translating the hyperparameters. In Lasagne they look like this:

lr = 1e-2
weight_decay = 1e-5

prediction = lasagne.layers.get_output(net['out'])
loss = T.mean(lasagne.objectives.squared_error(prediction, target_var))

weightsl2 = lasagne.regularization.regularize_network_params(net['out'], lasagne.regularization.l2)
loss += weight_decay * weightsl2

How do I perform the L2 regularization part in Caffe? Do I have to add a regularization layer after each convolution/inner-product layer? The relevant parts of my solver.prototxt are below:

base_lr: 0.01
lr_policy: "fixed"
weight_decay: 0.00001
regularization_type: "L2"
stepsize: 300
gamma: 0.1  
max_iter: 2000
momentum: 0.9

Also posted on http://datascience.stackexchange.com. Waiting for answers.

please do not post duplicate questions in multiple stackexchange sites. — Shai, Jan 11 '17 at 08:42
posted on datascience, waited for answers, got no reply, then I posted on stackoverflow. I will avoid multiple posting henceforth. — user27665, Jan 11 '17 at 08:48

score 2 · Accepted Answer · edited May 23 '17 at 12:24

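It seems like you already got it right.
The weight_decay meta-parameter combined with regularization_type: "L2" in your 'solver.prototxt' tells Caffe to use L2 regularization with weight_decay = 1e-5. The solver applies this to the parameters directly, so no extra regularization layer is needed after the convolution/inner-product layers.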
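One caveat if you need the two frameworks to match numerically: as far as I can tell, lasagne.regularization.l2 is the plain sum of squares (no factor of 1/2), while Caffe's "L2" regularizer adds weight_decay * w directly to the gradient. Under that assumption, the Lasagne loss above contributes 2 * weight_decay * w to the gradient, so the closest Caffe setting would be weight_decay: 2e-5 rather than 1e-5. The NumPy sketch below only illustrates that arithmetic; the function names are hypothetical and it calls neither library.

```python
import numpy as np

def lasagne_style_penalty_grad(w, weight_decay):
    # Gradient of weight_decay * sum(w**2) with respect to w
    # (assumes the penalty is the plain sum of squares).
    return 2.0 * weight_decay * w

def caffe_style_decay_grad(w, weight_decay, decay_mult=1.0):
    # Assumed Caffe behaviour for regularization_type "L2":
    # the solver adds (weight_decay * decay_mult) * w to the gradient.
    return weight_decay * decay_mult * w

w = np.array([0.5, -1.0, 2.0])

g_lasagne = lasagne_style_penalty_grad(w, 1e-5)
g_caffe_same = caffe_style_decay_grad(w, 1e-5)     # same number copied over
g_caffe_doubled = caffe_style_decay_grad(w, 2e-5)  # doubled to compensate

print(np.allclose(g_lasagne, g_caffe_same))
print(np.allclose(g_lasagne, g_caffe_doubled))
```

With a decay this small the difference is unlikely to matter much in practice, but it is worth knowing about when comparing loss curves between the two implementations.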

One more thing you might want to tweak is how strongly the regularization affects each parameter. You can set this for each parameter blob in the net via

param { decay_mult: 1 }

For example, an "InnerProduct" layer with bias has two parameter blobs (the weights, then the bias), each with its own param entry:

layer {
  type: "InnerProduct"
  name: "fc1"
  # bottom and top here
  inner_product_param { 
    bias_term: true
    # ... other params
  }
  param { decay_mult: 1 } # for weights use regularization
  param { decay_mult: 0 } # do not regularize the bias
}

By default, decay_mult is 1, so every parameter blob in the net is regularized equally. The solver's weight_decay is multiplied by each blob's decay_mult, so you can change that value to regularize specific parameter blobs more or less strongly.
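To make the per-blob effect concrete, here is a small Python sketch of the effective decay each blob of the "fc1" layer above would see, assuming the effective value is simply weight_decay * decay_mult. The helper name and the dictionary are hypothetical, not part of Caffe's API. Setting decay_mult: 0 on the bias also seems to be the closer match to the Lasagne code in the question, since regularize_network_params by default (to my knowledge) only collects parameters tagged as regularizable, which excludes biases.

```python
def effective_decay(weight_decay, decay_mult=1.0):
    # Assumed rule: solver-level weight_decay scaled by the blob's decay_mult.
    return weight_decay * decay_mult

solver_weight_decay = 1e-5  # value from the solver.prototxt in the question

# decay_mult per parameter blob, mirroring the "fc1" example above
decay_mults = {"fc1 weights": 1.0, "fc1 bias": 0.0}

effective = {
    name: effective_decay(solver_weight_decay, mult)
    for name, mult in decay_mults.items()
}

for name, value in effective.items():
    print(name, value)
```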

Do you know how to select max norm regularization? I can only find the L1 and L2 types. — John, Jul 29 '17 at 15:12
@user8264 AFAIK Caffe does not yet have max norm regularization. — Shai, Jul 29 '17 at 20:38
