I have some Lasagne code and want to create the same network in Caffe. I was able to convert the network itself, but I need help carrying over the hyperparameters. In Lasagne they look like this:

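import lasagne
import theano.tensor as T

lr = 1e-2
weight_decay = 1e-5

# net (dict of layers) and target_var are defined elsewhere
prediction = lasagne.layers.get_output(net['out'])
loss = T.mean(lasagne.objectives.squared_error(prediction, target_var))

# add the L2 penalty over the network's regularizable parameters
weightsl2 = lasagne.regularization.regularize_network_params(net['out'], lasagne.regularization.l2)
loss += weight_decay * weightsl2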

How do I perform the L2 regularization part in Caffe? Do I have to add a regularization layer after each convolution/inner-product layer? The relevant parts of my solver.prototxt are below:

base_lr: 0.01
lr_policy: "fixed"
weight_decay: 0.00001
regularization_type: "L2"
stepsize: 300
gamma: 0.1  
max_iter: 2000
momentum: 0.9

Also posted on http://datascience.stackexchange.com; waiting for answers.

user27665
  • Please do not post duplicate questions on multiple Stack Exchange sites. – Shai Jan 11 '17 at 08:42
  • I posted on datascience, waited for answers, got no reply, and then posted on Stack Overflow. I will avoid multiple posting henceforth. – user27665 Jan 11 '17 at 08:48

1 Answer

It seems like you already got it right.
The weight_decay meta-parameter combined with regularization_type: "L2" in your solver.prototxt tells Caffe to use L2 regularization with weight_decay = 1e-5. No extra regularization layer is needed.
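
To see what that setting does numerically, here is a minimal pure-Python sketch. It rests on two assumptions that are not stated in this thread: that Caffe's SGD solver applies L2 decay by adding weight_decay * decay_mult * W to each blob's gradient (equivalent to a penalty of (weight_decay / 2) * sum(W**2)), and that lasagne.regularization.l2 is the plain sum of squares without the 1/2. The function names are hypothetical and only illustrate the arithmetic.

```python
def lasagne_style_l2_grad(weights, weight_decay):
    # Gradient of weight_decay * sum(w**2), the penalty added in the question.
    return [2.0 * weight_decay * w for w in weights]


def caffe_style_l2_grad(weights, weight_decay, decay_mult=1.0):
    # Assumed Caffe behaviour: the solver adds local_decay * w to the gradient,
    # where local_decay = weight_decay * decay_mult.
    local_decay = weight_decay * decay_mult
    return [local_decay * w for w in weights]


weights = [0.5, -1.0, 2.0]  # made-up example values

g_lasagne = lasagne_style_l2_grad(weights, 1e-5)
g_caffe = caffe_style_l2_grad(weights, 1e-5)
g_caffe_doubled = caffe_style_l2_grad(weights, 2e-5)

print("lasagne-style:       ", g_lasagne)
print("caffe-style (1e-5):  ", g_caffe)
print("caffe-style (2e-5):  ", g_caffe_doubled)
```

If both assumptions hold, the two gradients differ by a factor of 2, so weight_decay: 0.00002 may be the closer match to the Lasagne code. At values this small the difference is unlikely to matter much in practice, but it is worth checking if you need the two networks to train identically.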

One more thing you might want to tweak is how strongly regularization affects each parameter. You can set this for each parameter blob in the net via

param { decay_mult: 1 }

For example, an "InnerProduct" layer with bias has two parameter blobs (weights and bias), so it takes two param entries, in that order:

layer {
  type: "InnerProduct"
  name: "fc1"
  # bottom and top here
  inner_product_param { 
    bias_term: true
    # ... other params
  }
  param { decay_mult: 1 } # for weights use regularization
  param { decay_mult: 0 } # do not regularize the bias
}

By default, decay_mult is 1, so every parameter blob in the net is regularized equally. You can change it to regularize specific parameter blobs more or less strongly.
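
The effect of decay_mult can be sketched in a few lines of Python, assuming the solver scales the global weight_decay by each blob's decay_mult. The function and the blob names ("fc1/weights", "fc1/bias") are hypothetical labels for illustration, not Caffe identifiers.

```python
def effective_decay(weight_decay, decay_mults):
    # decay_mults maps a (hypothetical) blob label to its decay_mult value.
    # The effective per-blob decay is assumed to be weight_decay * decay_mult.
    return {name: weight_decay * mult for name, mult in decay_mults.items()}


# Mirrors the "fc1" layer above: regularize the weights, skip the bias.
fc1_mults = {"fc1/weights": 1.0, "fc1/bias": 0.0}
rates = effective_decay(1e-5, fc1_mults)

for name, rate in rates.items():
    print(name, rate)
```

As far as I can tell, regularize_network_params in Lasagne only collects parameters tagged as regularizable by default, and the built-in layers do not tag their biases that way. If so, setting decay_mult: 0 on the bias blobs is what reproduces the Lasagne code, rather than leaving every blob at the default.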

Shai