
I am starting with TensorFlow 2.0 and trying to implement Guided Backpropagation to display a saliency map. I started by computing the loss between y_pred and y_true for an image, then finding the gradients of all layers with respect to this loss.

with tf.GradientTape() as tape:
    logits = model(tf.cast(image_batch_val, dtype=tf.float32))
    print('`logits` has type {0}'.format(type(logits)))
    # labels must be a float tensor, matching the dtype of the logits
    xentropy = tf.nn.softmax_cross_entropy_with_logits(
        labels=tf.cast(tf.one_hot(1 - label_batch_val, depth=2), dtype=tf.float32),
        logits=logits)
    reduced = tf.reduce_mean(xentropy)

grads = tape.gradient(reduced, model.trainable_variables)

However, I don't know what to do with these gradients in order to obtain the guided backpropagation result.

This is my model. I created it using Keras layers:

from tensorflow.keras import models
from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization, Activation,
                                     MaxPool2D, Flatten, Dense, Dropout)

image_input = Input((input_size, input_size, 3))

conv_0 = Conv2D(32, (3, 3), padding='SAME')(image_input)
conv_0_bn = BatchNormalization()(conv_0)
conv_0_act = Activation('relu')(conv_0_bn)
conv_0_pool = MaxPool2D((2, 2))(conv_0_act)

conv_1 = Conv2D(64, (3, 3), padding='SAME')(conv_0_pool)
conv_1_bn = BatchNormalization()(conv_1)
conv_1_act = Activation('relu')(conv_1_bn)
conv_1_pool = MaxPool2D((2, 2))(conv_1_act)

conv_2 = Conv2D(64, (3, 3), padding='SAME')(conv_1_pool)
conv_2_bn = BatchNormalization()(conv_2)
conv_2_act = Activation('relu')(conv_2_bn)
conv_2_pool = MaxPool2D((2, 2))(conv_2_act)

conv_3 = Conv2D(128, (3, 3), padding='SAME')(conv_2_pool)
conv_3_bn = BatchNormalization()(conv_3)
conv_3_act = Activation('relu')(conv_3_bn)

conv_4 = Conv2D(128, (3, 3), padding='SAME')(conv_3_act)
conv_4_bn = BatchNormalization()(conv_4)
conv_4_act = Activation('relu')(conv_4_bn)
conv_4_pool = MaxPool2D((2, 2))(conv_4_act)

conv_5 = Conv2D(128, (3, 3), padding='SAME')(conv_4_pool)
conv_5_bn = BatchNormalization()(conv_5)
conv_5_act = Activation('relu')(conv_5_bn)

conv_6 = Conv2D(128, (3, 3), padding='SAME')(conv_5_act)
conv_6_bn = BatchNormalization()(conv_6)
conv_6_act = Activation('relu')(conv_6_bn)

flat = Flatten()(conv_6_act)

fc_0 = Dense(64, activation='relu')(flat)
fc_0_bn = BatchNormalization()(fc_0)

fc_1 = Dense(32, activation='relu')(fc_0_bn)
fc_1_drop = Dropout(0.5)(fc_1)

output = Dense(2, activation='softmax')(fc_1_drop)

model = models.Model(inputs=image_input, outputs=output)

I am glad to provide more code if needed.

Tai Christian
    Could you please detail which gradients you want to obtain, exactly? What is your desired result? – rvinas May 06 '19 at 13:34
  • Thank you @rvinas. My final goal is to show how much each region in the input image contributes to the output of a CNN, like the results in the paper https://arxiv.org/abs/1412.6806 – Tai Christian May 14 '19 at 10:29

2 Answers


First of all, you have to change how the gradient is computed through a ReLU, i.e. apply the Guided Backprop formula R^l = (f^l > 0) * (R^{l+1} > 0) * R^{l+1}, where f^l is the forward activation of the ReLU and R^{l+1} is the gradient flowing in from the layer above.

The paper also illustrates this rule with a graphical example.

This formula can be implemented with the following code:

@tf.RegisterGradient("GuidedRelu")
def _GuidedReluGrad(op, grad):
    gate_f = tf.cast(op.outputs[0] > 0, "float32")  # f^l > 0
    gate_R = tf.cast(grad > 0, "float32")           # R^{l+1} > 0
    return gate_f * gate_R * grad

Now you have to override the original TF implementation of ReLU with:

with tf.compat.v1.get_default_graph().gradient_override_map({'Relu': 'GuidedRelu'}):
   #put here the code for computing the gradient
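
The key detail is that gradient_override_map only affects ops created while the scope is active, so the forward pass and the gradient op both have to be built inside it, in graph mode. A rough sketch of what that could look like with the tf.compat.v1 API (eager execution disabled and the model built afterwards; class_index is a hypothetical index of the class you want to visualize, while input_size and image_batch_val are the variables from the question):

import tensorflow as tf

tf.compat.v1.disable_eager_execution()  # gradient_override_map only applies during graph construction

graph = tf.compat.v1.get_default_graph()
with graph.gradient_override_map({'Relu': 'GuidedRelu'}):
    # The forward pass is built inside the scope, so its Relu ops use GuidedRelu.
    x = tf.compat.v1.placeholder(tf.float32, (None, input_size, input_size, 3))
    probs = model(x)
    class_score = probs[:, class_index]              # backprop from a single output neuron
    guided_grads = tf.gradients(class_score, x)[0]   # guided gradients w.r.t. the input

with tf.compat.v1.Session() as sess:
    sess.run(tf.compat.v1.global_variables_initializer())
    grads_val = sess.run(guided_grads, feed_dict={x: image_batch_val})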

After computing the gradient, you can visualize the result. One last remark, though: you compute the visualization for a single class. This means you take the activation of a chosen neuron and set the activations of all other neurons to zero as the starting signal for Guided BackProp.
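
In practice, "setting the other activations to zero" just means keeping the chosen output neuron and discarding the rest before taking the gradient with respect to the input, e.g. by multiplying the output with a one-hot mask. A minimal sketch of that selection step with the model and image_batch_val from the question (class_idx is a hypothetical class index, and the guided ReLU gradient still has to be in effect, via one of the mechanisms discussed in this thread, for the result to be guided rather than a plain saliency map):

import tensorflow as tf

class_idx = 1  # hypothetical: the output neuron to visualize

images = tf.cast(image_batch_val, tf.float32)
with tf.GradientTape() as tape:
    tape.watch(images)
    preds = model(images)                           # shape (batch, 2)
    # Keep only the chosen neuron; all other activations are zeroed out.
    score = tf.reduce_sum(preds * tf.one_hot(class_idx, preds.shape[-1]))

saliency_grads = tape.gradient(score, images)       # same shape as the input batch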

Simdi
  • @Simdi thank you for your answer! I think it should work perfectly in TensorFlow 1. However, as I mentioned in my question, I am trying to implement this in TensorFlow 2, in which sessions and graphs are no longer used. How could I modify your answer to make it work in TensorFlow 2? – Tai Christian May 14 '19 at 12:04
  • if I see [this](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/Graph#gradient_override_map) correctly, you should be able to use the code in TensorFlow 2. Correct me if I am wrong – Simdi May 14 '19 at 18:24
  • Thank you very much! It works; tf.get_default_graph() becomes tf.compat.v1.get_default_graph() in TF 2.0 – Tai Christian Jul 08 '19 at 08:57
  • Sorry for the silly question, but after receiving the gradients, what else do I need to do to get the final saliency map? – Tai Christian Jul 08 '19 at 09:41
  • There are several options. You can use the absolute value of the gradient, or you can square the gradients. After that you can normalize the result and then plot it. – Simdi Jul 09 '19 at 18:37

I tried @tf.RegisterGradient and gradient_override_map as @Simdi suggested, but it was not effective with TF2. I am not sure whether I made a mistake in any of the steps, but it seems that Relu was not replaced by GuidedRelu. I think this is because "There is no built-in mechanism in TensorFlow 2.0 to override all gradients for a built-in operator within a scope", as answered by mrry in this discussion: https://stackoverflow.com/a/55799378/11524628

I used @tf.custom_gradient as mrry said and it worked perfectly for me:

import tensorflow as tf
from tensorflow.keras.models import Model

@tf.custom_gradient
def guidedRelu(x):
    def grad(dy):
        # Only let the gradient through where both the incoming gradient (dy)
        # and the forward activation (x) are positive.
        return tf.cast(dy > 0, "float32") * tf.cast(x > 0, "float32") * dy
    return tf.nn.relu(x), grad

model = tf.keras.applications.resnet50.ResNet50(weights='imagenet', include_top=True)
gb_model = Model(
    inputs=model.inputs,
    outputs=model.get_layer("conv5_block3_out").output
)

# Replace the ReLU activation of every layer with the guided version.
layer_dict = [layer for layer in gb_model.layers[1:] if hasattr(layer, 'activation')]
for layer in layer_dict:
    if layer.activation == tf.keras.activations.relu:
        layer.activation = guidedRelu

with tf.GradientTape() as tape:
    inputs = tf.cast(preprocessed_input, tf.float32)  # preprocessed_input: the preprocessed image batch
    tape.watch(inputs)
    outputs = gb_model(inputs)

grads = tape.gradient(outputs, inputs)[0]
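
To turn these gradients into a viewable saliency map, one common recipe (as also suggested in the comments on the first answer) is to take the absolute value or the square of the gradients, reduce over the colour channels, normalize to [0, 1], and plot. A minimal sketch, assuming matplotlib is installed:

import matplotlib.pyplot as plt

# Absolute gradient magnitude, collapsed over the RGB channels.
saliency = tf.reduce_max(tf.abs(grads), axis=-1)

# Normalize to [0, 1] for display.
saliency = (saliency - tf.reduce_min(saliency)) / (tf.reduce_max(saliency) - tf.reduce_min(saliency) + 1e-8)

plt.imshow(saliency.numpy(), cmap='viridis')
plt.axis('off')
plt.show()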

You can see the implementation of both methods above in this Google Colab Notebook: https://colab.research.google.com/drive/17tAC7xx2IJxjK700bdaLatTVeDA02GJn?usp=sharing

  • @tf.custom_gradient worked.
  • @tf.RegisterGradient did not work, as Relu was not overridden with the registered GuidedRelu.
Hoa Nguyen
  • Thanks for sharing your implementation. Worked for me as well! – Jimi Oke Jun 14 '21 at 15:09
  • Don't you need to add the class label? – mCalado Sep 03 '21 at 21:20
  • Guided Backpropagation doesn't require the class label. When you want to visualize Guided-GradCAM, yes, we need a class label! – Hoa Nguyen Sep 04 '21 at 00:08
  • Okay, I just saw the Captum (PyTorch) and innvestigate (TF 1.0) versions of GBP and they all require some label, so that's why I asked. Another question I have is whether there's a difference between using the model's output and the last conv layer – mCalado Sep 04 '21 at 00:11