
I have a simple neural network for which I am trying to plot the gradients in TensorBoard, using a callback like the one below:

class GradientCallback(tf.keras.callbacks.Callback):
    console = False
    count = 0
    run_count = 0

    def on_epoch_end(self, epoch, logs=None):
        weights = [w for w in self.model.trainable_weights if 'dense' in w.name and 'bias' in w.name]
        self.run_count += 1
        run_dir = logdir+"/gradients/run-" + str(self.run_count)
        with tf.summary.create_file_writer(run_dir).as_default(), tf.GradientTape() as g:
          # use test data to calculate the gradients
          _x_batch = test_images_scaled_reshaped[:100]
          _y_batch = test_labels_enc[:100]
          g.watch(_x_batch)
          _y_pred = self.model(_x_batch)  # forward-propagation
          per_sample_losses = tf.keras.losses.categorical_crossentropy(_y_batch, _y_pred) 
          average_loss = tf.reduce_mean(per_sample_losses) # Compute the loss value
          gradients = g.gradient(average_loss, self.model.weights) # Compute the gradient

        for t in gradients:
          tf.summary.histogram(str(self.count), data=t)
          self.count+=1
          if self.console:
                print('Tensor: {}'.format(t.name))
                print('{}\n'.format(K.get_value(t)[:10]))

# Set up logging
!rm -rf ./logs/ # clear old logs
from datetime import datetime
import os
root_logdir = "logs"
run_id = datetime.now().strftime("%Y%m%d-%H%M%S")
logdir = os.path.join(root_logdir, run_id)


# register callbacks; these will be used for TensorBoard later
callbacks = [
    tf.keras.callbacks.TensorBoard(log_dir=logdir, histogram_freq=1,
                                   write_images=True, write_grads=True),
    GradientCallback()
]

Then, I use the callbacks during fit:

network.fit(train_pipe, epochs=epochs, batch_size=batch_size, validation_data=val_pipe, callbacks=callbacks)

Now, when I check TensorBoard, I can see the gradients runs in the left-side runs filter, but nothing shows up in the Histograms tab:

[Screenshot: TensorBoard Histograms tab with the gradients runs selected, showing no histograms]

What am I missing here? Am I logging the gradients correctly?

bit
  • Is your question resolved now? If not, please check [this](https://stackoverflow.com/a/64062465/14290681); it may help you. –  Sep 28 '20 at 09:12
  • That link uses TensorFlow 1.x. – bit Oct 01 '20 at 08:43
  • Hi, I am trying to do the same thing: did you come to grips with this? Any solution? There is not much around about it. – roschach Nov 26 '20 at 08:29
  • Hey sorry, haven't found any solution to this. Wish there was better documentation around it. – bit Nov 30 '20 at 03:24

1 Answer


It looks like the issue is that you write your histograms outside the context of the tf.summary file writer. I changed your code accordingly, but I didn't try it out. Note that in TF 2.x, `tf.summary.histogram` also needs an explicit `step` argument (or a default step set via `tf.summary.experimental.set_step`), so the version below passes `step=epoch`.

class GradientCallback(tf.keras.callbacks.Callback):
    console = False
    count = 0
    run_count = 0

    def on_epoch_end(self, epoch, logs=None):
        weights = [w for w in self.model.trainable_weights if 'dense' in w.name and 'bias' in w.name]
        self.run_count += 1
        run_dir = logdir+"/gradients/run-" + str(self.run_count)
        with tf.summary.create_file_writer(run_dir).as_default():
          with tf.GradientTape() as g:
            # use test data to calculate the gradients
            _x_batch = test_images_scaled_reshaped[:100]
            _y_batch = test_labels_enc[:100]
            g.watch(_x_batch)
            _y_pred = self.model(_x_batch)  # forward-propagation
            per_sample_losses = tf.keras.losses.categorical_crossentropy(_y_batch, _y_pred)
            average_loss = tf.reduce_mean(per_sample_losses)  # Compute the loss value
            gradients = g.gradient(average_loss, self.model.weights)  # Compute the gradient

          # still inside the file writer's context, so these histograms actually get written
          for nr, grad in enumerate(gradients):
            # TF 2.x summaries require an explicit step; the current epoch works here
            tf.summary.histogram(str(nr), data=grad, step=epoch)
            if self.console:
              # eager gradient tensors have no usable .name; report the matching weight instead
              print('Gradient for {}'.format(self.model.weights[nr].name))
              print('{}\n'.format(grad.numpy()[:10]))
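
To see the logged histograms, point TensorBoard at the same log directory. A minimal usage sketch, assuming a notebook environment (which the `!rm -rf` in the question suggests) and the `logs` directory defined above:

%load_ext tensorboard
%tensorboard --logdir logs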

J.G
  • Getting `ValueError: Passed in object of type <class 'numpy.ndarray'>, not tf.Tensor` error – shaik moeed Feb 16 '21 at 09:13
  • @shaikmoeed most likely your data (_x_batch, _y_batch) is a numpy array, not a tf.Tensor. You need to convert it before you can use it in this context (see the sketch after these comments). The solution I proposed only fixes the problem of the gradients not being sent to TensorBoard; any other problems with the code are out of scope for the answer. – J.G Feb 17 '21 at 10:38
  • I would like similar behavior and am trying to use your code, but I do not understand where test_images_scaled_reshaped[:100] is coming from. Since the function is applied at epoch end, wouldn't you want to use the train data corresponding to that epoch? – funmath May 17 '21 at 18:52
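
Regarding the `ValueError` in the comments above: a minimal sketch of the conversion J.G describes, assuming `test_images_scaled_reshaped` and `test_labels_enc` from the question are NumPy arrays (the dtypes here are illustrative):

import tensorflow as tf

# Convert the NumPy batches to tensors before using them inside the GradientTape;
# tf.convert_to_tensor accepts NumPy arrays as well as existing tensors.
_x_batch = tf.convert_to_tensor(test_images_scaled_reshaped[:100], dtype=tf.float32)
_y_batch = tf.convert_to_tensor(test_labels_enc[:100], dtype=tf.float32)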