
I would like to monitor the gradients in TensorBoard with Keras to decide whether they are vanishing or exploding. What should I do?

– Joey Chia

1 Answer


To visualize the training in TensorBoard, add a keras.callbacks.TensorBoard callback to the model.fit call. Don't forget to set write_grads=True so the gradients are logged. Right after training starts, you can run...

tensorboard --logdir=/full_path_to_your_logs

... from the command line and point your browser to http://localhost:6006. See the example code in this question.
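A minimal sketch of wiring up the callback (the toy model and data are placeholders, not from the question; note that write_grads only exists in standalone Keras / TF 1.x and was removed in TF 2.x, so it is commented out here):

```python
import numpy as np
import tensorflow as tf

# Toy model and data, just to make the example self-contained.
x_train = np.random.rand(64, 10).astype("float32")
y_train = np.random.rand(64, 1).astype("float32")
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer="sgd", loss="mse")

tb = tf.keras.callbacks.TensorBoard(
    log_dir="./logs",
    histogram_freq=1,    # log histograms every epoch ("Distributions" tab)
    # write_grads=True,  # standalone Keras / TF 1.x only; removed in TF 2.x
)
model.fit(x_train, y_train, epochs=2, callbacks=[tb], verbose=0)
```

With histogram_freq set, weight histograms appear per epoch; under TF 1.x / standalone Keras, write_grads=True adds the gradient histograms alongside them.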

To check for vanishing / exploding gradients, pay attention to the gradient distribution and absolute values in the layer of interest ("Distributions" tab):

  • If the distribution is highly peaked and concentrated around 0, the gradients are probably vanishing. Here's a concrete example of how it looks in practice.
  • If the distribution grows rapidly in absolute value over time, the gradients are exploding. Often the output values of the same layer become NaNs very quickly as well.
– Maxim
    Also, for this to work you need to have `histogram_freq > 1` and therefore validation data (which cannot be a generator, even a sequence). – Zaccharie Ramzi Nov 06 '19 at 16:08
  • Setting write_grads=True does not show gradients for me. Any suggestions? – J.J. Feb 11 '20 at 23:42
  • 4
    @J.J. write_grads=True is currently deprecated in the latest versions of tensorflow (2.1.0) https://github.com/tensorflow/tensorflow/issues/31173 I'm looking for a solution at the moment as well. – Elegant Code Feb 18 '20 at 23:47
  • 1
    @ElegantCode Did you find any solutions?? – ch271828n Apr 07 '20 at 03:03
  • 1
@ch271828n The only answer I keep getting is to write your own callback to save the gradients and visualize them yourself. – Elegant Code Apr 07 '20 at 19:19
  • 1
    @ElegantCode Could you please provide some sample code / repo? Thanks! – ch271828n Apr 08 '20 at 08:28
  • Maybe the code here can be adapted for Feed forward networks as well. https://stackoverflow.com/questions/59017288/how-to-visualize-rnn-lstm-gradients-in-keras-tensorflow – Elegant Code Apr 08 '20 at 23:38
  • @FrancescoBoi Unfortunately not. You either write your own custom callback (https://keras.io/examples/vision/visualizing_what_convnets_learn/ OR https://keras.io/examples/vision/integrated_gradients/ ) to record the gradients, or you can try to infer it from indirect indications, for example seeing that the distribution of weights in TensorBoard changes a bit throughout the NN, or that the loss steadily decreases. But at the end of the day, these are approximations, with many issues. – Elegant Code Jun 22 '20 at 21:13
  • You mean writing a new callback with `on_epoch_end` or subclassing the `keras.callbacks.TensorBoard()` callback or subclassing the optimizer (where the gradient is computed)? Sorry but the information about this is scattered and I can't come to grips. – roschach Jun 23 '20 at 08:24
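Since write_grads is gone in TF 2.x, the custom-callback route the comments describe can be sketched roughly like this for TF 2.x (the class name, log directory, and the probe batch are my own choices, not from the thread): gradients are recomputed with tf.GradientTape on a fixed batch at the end of each epoch and written as TensorBoard histograms.

```python
import tensorflow as tf

class GradientLogger(tf.keras.callbacks.Callback):
    """Write per-weight gradient histograms to TensorBoard after each epoch.

    Gradients are recomputed on a fixed probe batch (x, y) with the given
    loss function, so they approximate the training gradients rather than
    reproducing the optimizer's exact updates.
    """
    def __init__(self, log_dir, x, y, loss_fn):
        super().__init__()
        self.writer = tf.summary.create_file_writer(log_dir)
        self.x, self.y, self.loss_fn = x, y, loss_fn

    def on_epoch_end(self, epoch, logs=None):
        with tf.GradientTape() as tape:
            preds = self.model(self.x, training=True)
            loss = tf.reduce_mean(self.loss_fn(self.y, preds))
        grads = tape.gradient(loss, self.model.trainable_weights)
        with self.writer.as_default():
            for weight, grad in zip(self.model.trainable_weights, grads):
                if grad is not None:
                    tf.summary.histogram(f"gradients/{weight.name}",
                                         grad, step=epoch)
        self.writer.flush()
```

Usage would look like `model.fit(x, y, callbacks=[GradientLogger("./grad_logs", x_batch, y_batch, tf.keras.losses.mean_squared_error)])`; the histograms then appear under the "Distributions" and "Histograms" tabs, just like the old write_grads output.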