I am trying to train a recurrent neural network where the input is an image and the output is a probability blob. It is a pretty simple network, with Convolution, Pooling, and ReLU layers.
I have a set of convolution/ReLU blocks that are repeated several times to get a sharper blob. I can train successfully if I don't use shared weights, but if I do, the training always results in NaNs. Are there any special considerations to take note of when using shared weights to prevent NaNs? Could it be the learning rate I set for each conv block? Should the learning rate for the shared weights be smaller? A rough sketch of the kind of setup I mean is below.
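To make the question concrete, here is a minimal, purely illustrative sketch (written in PyTorch; my actual network is different and all layer names, channel counts, and learning rates here are made up) of what I mean by one conv/ReLU block whose weights are reused several times, with a separate learning rate for the shared parameters:

```python
import torch
import torch.nn as nn

class SharedRefineNet(nn.Module):
    def __init__(self, channels=32, num_repeats=3):
        super().__init__()
        # Initial conv/pool/ReLU stage (sizes are illustrative only).
        self.stem = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        # One conv/ReLU block whose weights are shared across all repeats.
        self.shared_block = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.num_repeats = num_repeats
        # 1x1 conv producing the single-channel probability blob.
        self.head = nn.Conv2d(channels, 1, 1)

    def forward(self, x):
        x = self.stem(x)
        for _ in range(self.num_repeats):
            # The same weights are applied on every pass.
            x = self.shared_block(x)
        return torch.sigmoid(self.head(x))

model = SharedRefineNet()

# Separate parameter groups so the shared weights can get a smaller learning rate.
shared_params = list(model.shared_block.parameters())
other_params = [p for n, p in model.named_parameters()
                if not n.startswith("shared_block")]
optimizer = torch.optim.SGD(
    [{"params": other_params, "lr": 1e-2},
     {"params": shared_params, "lr": 1e-3}],  # e.g. 10x smaller for shared weights
    momentum=0.9,
)
```

The part I am unsure about is the last bit: whether the shared block's learning rate should be scaled down (since its gradient accumulates contributions from every repeat), or whether something else entirely is causing the NaNs.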