
My training loss does not converge (batch size: 16, average loss: ~10). I have tried the following:

  • Varying the learning rate: the initial lr = 0.002 causes a very high loss (around 1e+10), and with lr = 1e-6 the loss becomes small but still does not converge.
  • Adding initialization for the bias (see the sketch below).
  • Adding regularization for the bias and weights.
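For reference, a minimal sketch of what I mean by bias initialization and bias/weight regularization in a single Caffe layer (the layer name, type, and sizes here are illustrative placeholders, not my actual network):

```
layer {
  name: "conv1"                          # hypothetical layer, for illustration only
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  # decay_mult > 0 applies weight decay (regularization) to this blob
  param { lr_mult: 1 decay_mult: 1 }     # weights
  param { lr_mult: 2 decay_mult: 1 }     # bias (regularized as well)
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }   # explicit bias initialization
  }
}
```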

Here are the network structure and the training loss log:

Hope to hear from you. Best regards

  • try a learning rate between 0.002 and 1e-6 – Shai Dec 20 '16 at 05:54
  • I have tried different learning rate values such as 2e-3, 2e-4, 2e-5. Those values caused a very high loss (around 1e+21). – Huynh Vu Dec 20 '16 at 09:06
  • 1
    You need to debug your training process. Set `debug_info: true` in your `'solver.prototxt'` (a sketch is shown after these comments) and use [these guidelines](http://stackoverflow.com/q/40510706/1714410) to see what is interfering with the training. – Shai Dec 20 '16 at 09:09
  • 1
    Thanks, Shai, for the suggestion – Huynh Vu Dec 23 '16 at 09:17
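A minimal `solver.prototxt` sketch with the suggested `debug_info: true` enabled (every other value is an illustrative placeholder, not the solver actually used in this question):

```
net: "train_val.prototxt"        # placeholder path to the net definition
base_lr: 1e-4                    # somewhere between the 0.002 and 1e-6 already tried
lr_policy: "step"
gamma: 0.1
stepsize: 10000
momentum: 0.9
weight_decay: 0.0005
display: 20
max_iter: 100000
snapshot: 5000
snapshot_prefix: "snapshots/net"
debug_info: true                 # prints per-layer data/diff magnitudes each iteration
solver_mode: GPU
```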

0 Answers