Problem
I'm running a Deep Neural Network on the MNIST where the loss defined as follow:
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, label))
The program seems to run correctly until I get a nan loss in the 10000+ th minibatch. Sometimes, the program runs correctly until it finished. I think tf.nn.softmax_cross_entropy_with_logits
is giving me this error.
This is strange, because the code just contains mul
and add
operations.
Possible Solution
Maybe I can use:
if cost == "nan":
optimizer = an empty optimizer
else:
...
optimizer = real optimizer
But I cannot find the type of nan
. How can I check a variable is nan
or not?
How else can I solve this problem?