
I would like to monitor the gradients in TensorBoard with Keras to decide whether they are vanishing or exploding. What should I do?

– Joey Chia

1 Answer


To visualize the training in TensorBoard, add a keras.callbacks.TensorBoard callback to the model.fit call. Don't forget to set write_grads=True so the gradients are logged. Right after training starts, you can run...

tensorboard --logdir=/full_path_to_your_logs

... from the command line and point your browser to http://localhost:6006. See the example code in this question.
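A minimal sketch of wiring up the callback (the toy model and data are placeholders, not from the question; note that write_grads only exists in standalone Keras / TF 1.x and was removed in TF 2.x, so it is commented out here):

```python
import numpy as np
import tensorflow as tf

# Toy model and data, just to make the example self-contained.
x_train = np.random.rand(64, 10).astype("float32")
y_train = np.random.rand(64, 1).astype("float32")
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer="sgd", loss="mse")

tb = tf.keras.callbacks.TensorBoard(
    log_dir="./logs",
    histogram_freq=1,    # log histograms every epoch ("Distributions" tab)
    # write_grads=True,  # standalone Keras / TF 1.x only; removed in TF 2.x
)
model.fit(x_train, y_train, epochs=2, callbacks=[tb], verbose=0)
```

With histogram_freq set, weight histograms appear per epoch; under TF 1.x / standalone Keras, write_grads=True adds the gradient histograms alongside them.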

To check for vanishing / exploding gradients, pay attention to the gradient distribution and absolute values in the layer of interest ("Distributions" tab):

  • If the distribution is highly peaked and concentrated around 0, the gradients are probably vanishing. Here's a concrete example of how it looks in practice.
  • If the distribution grows rapidly in absolute value over time, the gradients are exploding. Often the output values of the same layer become NaNs very quickly as well.
– Maxim
    Also, for this to work you need to have `histogram_freq > 1` and therefore validation data (which cannot be a generator, even a sequence). – Zaccharie Ramzi Nov 06 '19 at 16:08
  • Setting write_grads=True does not show gradients for me. Any suggestions? – J.J. Feb 11 '20 at 23:42
  • 4
    @J.J. write_grads=True is currently deprecated in the latest versions of tensorflow (2.1.0) https://github.com/tensorflow/tensorflow/issues/31173 I'm looking for a solution at the moment as well. – Elegant Code Feb 18 '20 at 23:47
  • 1
    @ElegantCode Did you find any solutions?? – ch271828n Apr 07 '20 at 03:03
  • 1
@ch271828n The only answer I keep getting is to write your own callback to save the gradients and visualize them yourself. – Elegant Code Apr 07 '20 at 19:19
  • 1
    @ElegantCode Could you please provide some sample code / repo? Thanks! – ch271828n Apr 08 '20 at 08:28
  • Maybe the code here can be adapted for Feed forward networks as well. https://stackoverflow.com/questions/59017288/how-to-visualize-rnn-lstm-gradients-in-keras-tensorflow – Elegant Code Apr 08 '20 at 23:38
  • @FrancescoBoi Unfortunately not. You either write your own custom callback (https://keras.io/examples/vision/visualizing_what_convnets_learn/ OR https://keras.io/examples/vision/integrated_gradients/ ) to record the gradients, or you can try to infer it from indirect indications, for example seeing that the distribution of weights in TensorBoard changes a bit throughout the NN, or that the loss steadily decreases. But at the end of the day, these are approximations, with many issues. – Elegant Code Jun 22 '20 at 21:13
  • You mean writing a new callback with `on_epoch_end` or subclassing the `keras.callbacks.TensorBoard()` callback or subclassing the optimizer (where the gradient is computed)? Sorry but the information about this is scattered and I can't come to grips. – roschach Jun 23 '20 at 08:24
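Since write_grads is gone in TF 2.x, the custom-callback route the comments describe can be sketched roughly like this for TF 2.x (the class name, log directory, and the probe batch are my own choices, not from the thread): gradients are recomputed with tf.GradientTape on a fixed batch at the end of each epoch and written as TensorBoard histograms.

```python
import tensorflow as tf

class GradientLogger(tf.keras.callbacks.Callback):
    """Write per-weight gradient histograms to TensorBoard after each epoch.

    Gradients are recomputed on a fixed probe batch (x, y) with the given
    loss function, so they approximate the training gradients rather than
    reproducing the optimizer's exact updates.
    """
    def __init__(self, log_dir, x, y, loss_fn):
        super().__init__()
        self.writer = tf.summary.create_file_writer(log_dir)
        self.x, self.y, self.loss_fn = x, y, loss_fn

    def on_epoch_end(self, epoch, logs=None):
        with tf.GradientTape() as tape:
            preds = self.model(self.x, training=True)
            loss = tf.reduce_mean(self.loss_fn(self.y, preds))
        grads = tape.gradient(loss, self.model.trainable_weights)
        with self.writer.as_default():
            for weight, grad in zip(self.model.trainable_weights, grads):
                if grad is not None:
                    tf.summary.histogram(f"gradients/{weight.name}",
                                         grad, step=epoch)
        self.writer.flush()
```

Usage would look like `model.fit(x, y, callbacks=[GradientLogger("./grad_logs", x_batch, y_batch, tf.keras.losses.mean_squared_error)])`; the histograms then appear under the "Distributions" and "Histograms" tabs, just like the old write_grads output.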