I had a NaN issue that prevented my model from running for even one iteration, and the batch normalization solution in this post let it run. However, some iterations still return NaNs/Infs, and they go away after a few iterations. Is this normal?
I also noticed that the number of LSTM units affects this behavior. Can anyone explain the proper way to combine batch normalization with the choice of the number of units in the LSTM layers?
My model's structure is similar to the one in this post, and I'm just wondering where in the model I should apply batch normalization. Is the batch normalization correctly implemented as-is, or do I need to add it after each LSTM layer?
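For reference, here is a minimal sketch of the placement I'm asking about, with `BatchNormalization` inserted after each LSTM layer. The layer sizes, input shape, and dummy data are placeholders for illustration, not my actual model:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, BatchNormalization, Dense

# Placeholder shapes: 30 timesteps, 8 features per timestep.
model = Sequential([
    # First LSTM returns the full sequence so the next LSTM can consume it.
    LSTM(64, return_sequences=True, input_shape=(30, 8)),
    # Is normalizing here, between the recurrent layers, the right place?
    BatchNormalization(),
    LSTM(32),
    # ...and again after the second LSTM layer?
    BatchNormalization(),
    Dense(1),
])

model.compile(optimizer="adam", loss="mse")

# Dummy data just to confirm the model trains for one epoch.
x = np.random.rand(100, 30, 8).astype("float32")
y = np.random.rand(100, 1).astype("float32")
model.fit(x, y, epochs=1, batch_size=16, verbose=0)
```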