I have a YOLO-like network architecture where the output layer predicts bounding boxes with coordinates x, y, width, and height. With a linear activation function everything works fine, but the model sometimes predicts negative values, which don't make sense in my case: x and y should lie between 0 and 1, and width and height are 3 or 5. I thought I could use a ReLU activation on the output layer instead, but if I do, the network gets stuck with NaN as the loss value.
Any ideas why that could be?
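For context, here is a toy NumPy sketch of what I mean by the two output activations (the pre-activation values are made up, this is not my actual network code). It also shows the one difference I noticed: ReLU clamps negative pre-activations to exactly 0, and its gradient is 0 over that whole region.

```python
import numpy as np

# Hypothetical pre-activations for one predicted box: x, y, w, h.
# The values are illustrative only, not taken from my real model.
z = np.array([-0.3, 0.7, 2.5, -1.2])

def linear(z):
    # Identity output: can emit negative coordinates.
    return z

def relu(z):
    # Clamps negatives to exactly 0, so no negative coordinates...
    return np.maximum(z, 0.0)

def relu_grad(z):
    # ...but the gradient is 0 wherever z <= 0, so those
    # outputs receive no learning signal at all.
    return (z > 0).astype(float)

print(linear(z))     # negatives possible
print(relu(z))       # no negatives
print(relu_grad(z))  # zero gradient on the clamped units
```

I am wondering whether this zero-gradient region is related to the loss blowing up to NaN, or whether the cause is something else entirely.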