I am trying to train a recurrent neural network where the input is an image and the output is a probability blob. It is a fairly simple network, built from Convolution, Pooling and ReLU layers.

I have a set of convolution/ReLU blocks that are repeated several times to get a sharper blob. I can train successfully if I don't use shared weights, but if I do, the training always results in NaNs. Are there any special considerations to take into account when using shared weights to prevent NaNs? Could it be the learning rate I set for each Conv block? Should the learning rate for shared weights be smaller?
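
For reference, weight sharing in Caffe is declared by giving the `param` entries the same name in every layer that should share them; the gradients from all of those layers are then accumulated into the single shared blob, so the effective update can be much larger than for an unshared layer. A minimal sketch (layer, blob and parameter names below are placeholders, not the actual net):

```
# Two Convolution layers sharing one set of weights.
# Giving the param blobs the same name makes Caffe reuse
# (and jointly update) the same underlying weights.
layer {
  name: "conv_block1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { name: "shared_conv_w" lr_mult: 1 }  # shared weights
  param { name: "shared_conv_b" lr_mult: 2 }  # shared bias
  convolution_param { num_output: 64 kernel_size: 3 pad: 1 }
}
layer {
  name: "conv_block2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param { name: "shared_conv_w" lr_mult: 1 }  # same name -> same weights
  param { name: "shared_conv_b" lr_mult: 2 }
  convolution_param { num_output: 64 kernel_size: 3 pad: 1 }
}
```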

  • It sounds like you have "exploding gradients" in your recurrent net. Use `debug_info` to get a clearer understanding of the situation. – Shai Mar 27 '18 at 05:25
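
In case it is useful: `debug_info` is a boolean setting in the solver prototxt; when enabled, Caffe logs the data and gradient magnitudes of every layer at each iteration, which makes it easier to see where the values start to blow up. A minimal solver sketch (the net path and hyperparameter values are illustrative only):

```
# solver.prototxt sketch -- values here are placeholders
net: "train_val.prototxt"
base_lr: 0.001
lr_policy: "fixed"
momentum: 0.9
display: 20
max_iter: 10000
solver_mode: GPU
debug_info: true   # log per-layer activation/gradient magnitudes each iteration
```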

0 Answers