
My training loss does not converge (batch size: 16, average loss: ~10). I have tried the following:

  • Varying the learning rate: the initial lr = 0.002 causes a very high loss (around 1e+10), and with lr = 1e-6 the loss becomes small but still does not converge.
  • Adding initialization for the bias (see the sketch below).
  • Adding regularization for the bias and weights.
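For reference, a minimal sketch of what I mean by bias initialization and bias/weight regularization in a single Caffe layer (the layer name, type, and sizes here are illustrative placeholders, not my actual network):

```
layer {
  name: "conv1"                          # hypothetical layer, for illustration only
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  # decay_mult > 0 applies weight decay (regularization) to this blob
  param { lr_mult: 1 decay_mult: 1 }     # weights
  param { lr_mult: 2 decay_mult: 1 }     # bias (regularized as well)
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }   # explicit bias initialization
  }
}
```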

Here are the network structure and the training loss log:

Hope to hear from you. Best regards

  • try a learning rate between 0.002 and 1e-6 – Shai Dec 20 '16 at 05:54
  • I have tried different learning rate values such as 2e-3, 2e-4, 2e-5. Those values caused a very high loss (around 1e+21). – Huynh Vu Dec 20 '16 at 09:06
  • 1
    You need to debug your training process. Set `debug_info: true` in your `'solver.prototxt'` (a sketch is shown after these comments) and use [these guidelines](http://stackoverflow.com/q/40510706/1714410) to see what is interfering with the training. – Shai Dec 20 '16 at 09:09
  • 1
    Thanks, Shai, for the suggestion – Huynh Vu Dec 23 '16 at 09:17
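A minimal `solver.prototxt` sketch with the suggested `debug_info: true` enabled (every other value is an illustrative placeholder, not the solver actually used in this question):

```
net: "train_val.prototxt"        # placeholder path to the net definition
base_lr: 1e-4                    # somewhere between the 0.002 and 1e-6 already tried
lr_policy: "step"
gamma: 0.1
stepsize: 10000
momentum: 0.9
weight_decay: 0.0005
display: 20
max_iter: 100000
snapshot: 5000
snapshot_prefix: "snapshots/net"
debug_info: true                 # prints per-layer data/diff magnitudes each iteration
solver_mode: GPU
```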

0 Answers