
I want to implement C-MWP as described here: https://arxiv.org/pdf/1608.00507.pdf in Keras/TensorFlow. This involves modifying the way backprop is performed: the new gradient is a function of the bottom activation responses, the weight parameters, and the gradients of the layer above.

As a start, I was looking at the way keras-vis is doing modified backprop:

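import tensorflow as tf
from tensorflow.python.framework import ops

def _register_guided_gradient(name):
    # Register the gradient only once per name.
    if name not in ops._gradient_registry._registry:
        @tf.RegisterGradient(name)
        def _guided_backprop(op, grad):
            dtype = op.outputs[0].dtype
            # Pass the gradient only where it is positive ...
            gate_g = tf.cast(grad > 0., dtype)
            # ... and only where the forward output was positive.
            gate_y = tf.cast(op.outputs[0] > 0, dtype)
            return gate_y * gate_g * grad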

However, to implement C-MWP I need access to the weights of the layer on which the backprop is performed. Is it possible to access the weights within the function decorated with @tf.RegisterGradient(name)? Or am I on the wrong path?

PaperBuddy

1 Answer


The gradient computation in TF is fundamentally per-operation. If the operation whose gradient you want to change is performed on the weights, or at least the weights are not far from it in the operation graph, you can try finding the weights tensor by walking the graph inside your custom gradient. For example, say you have something like

x = tf.get_variable(...)
y = 5.0 * x
tf.gradients(y, x)

Here the multiplication op has the inputs (5.0, x), so the variable is its second input. You can get to the variable tensor (more precisely, the tensor produced by the variable reading operation) with something like

@tf.RegisterGradient(name)
def my_grad(op, grad):
    weights = op.inputs[1]
    ...

If the weights are not immediate inputs, but you know how to get to them, you can walk the graph a bit using something like:

@tf.RegisterGradient(name)
def my_grad(op, grad):
    weights = op.inputs[1].op.inputs[0].op.inputs[2]
    ...
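
To make this concrete, here is a minimal sketch of how the pieces might fit together in graph mode. It assumes a plain tf.matmul layer whose second input is the value read from the weight variable. The gradient name MWPMatMul, the constant EPS, and the MWP-style rule inside the function (one reading of the linked paper, for non-negative activations) are assumptions for illustration, not a verified C-MWP implementation. The registered gradient is only used for ops created inside gradient_override_map.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # graph-mode API; the override map only affects graph construction

EPS = 1e-10  # assumed small constant to avoid division by zero


@tf.RegisterGradient("MWPMatMul")  # hypothetical name for the custom gradient
def _mwp_matmul_grad(op, grad):
    a = op.inputs[0]            # bottom activations, shape (batch, n_in)
    w = op.inputs[1]            # weights as read from the variable, shape (n_in, n_out)
    w_pos = tf.maximum(w, 0.0)  # keep only the non-negative weights
    z = tf.matmul(a, w_pos) + EPS  # per-output normaliser
    y = grad / z                   # incoming signal divided by the normaliser
    grad_a = a * tf.matmul(y, w_pos, transpose_b=True)
    # Nothing meaningful is sent to the weights in this sketch.
    return grad_a, tf.zeros_like(w)


graph = tf.Graph()
with graph.as_default():
    a = tf1.placeholder(tf.float32, shape=[None, 3])
    w = tf1.get_variable(
        "w",
        initializer=np.array([[1.0, -1.0], [2.0, 0.5], [-3.0, 1.0]], dtype=np.float32),
    )
    # Only MatMul ops created inside this block use the custom gradient.
    with graph.gradient_override_map({"MatMul": "MWPMatMul"}):
        out = tf.matmul(a, w)
    p_top = tf1.placeholder(tf.float32, shape=[None, 2])
    # grad_ys feeds the signal of the layer above into the backward pass.
    p_bottom = tf1.gradients(out, a, grad_ys=p_top)[0]
    init = tf1.global_variables_initializer()

with tf1.Session(graph=graph) as sess:
    sess.run(init)
    result = sess.run(p_bottom, {a: [[1.0, 2.0, 1.0]], p_top: [[0.25, 0.75]]})
```

With these toy numbers the values returned for a add up to the same total as p_top, which is what you would expect from a rule that redistributes the top signal over the inputs.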

Be aware that this solution is very hacky, since it hard-codes the exact structure of the graph. If you control the forward pass, it may be better to define a custom gradient only for the subgraph you care about. You can see how to do that in "How to register a custom gradient for a operation composed of tf operations", "How Can I Define Only the Gradient for a Tensorflow Subgraph?", and https://www.tensorflow.org/api_docs/python/tf/Graph#gradient_override_map
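
A rough sketch of the "custom gradient for a subgraph" route: newer TensorFlow versions provide the tf.custom_gradient decorator, which wraps the forward computation and its backward rule in one function, so the weights are simply an argument and no graph walking is needed. This is a different mechanism from gradient_override_map. The function name mwp_dense, the constant EPS, and the MWP-style rule are illustrative assumptions, and this version assumes eager execution (TensorFlow 2.x).

```python
import numpy as np
import tensorflow as tf

EPS = 1e-10  # assumed small constant to avoid division by zero


@tf.custom_gradient
def mwp_dense(a, w):
    """Forward: ordinary matmul. Backward: an assumed MWP-style rule."""
    out = tf.matmul(a, w)

    def grad(dy):
        w_pos = tf.maximum(w, 0.0)     # keep only the non-negative weights
        z = tf.matmul(a, w_pos) + EPS  # per-output normaliser
        grad_a = a * tf.matmul(dy / z, w_pos, transpose_b=True)
        # One entry per input of mwp_dense: (a, w).
        return grad_a, tf.zeros_like(w)

    return out, grad


a = tf.constant([[1.0, 2.0, 1.0]])
w = tf.constant([[1.0, -1.0], [2.0, 0.5], [-3.0, 1.0]])
p_top = tf.constant([[0.25, 0.75]])

with tf.GradientTape() as tape:
    tape.watch(a)  # constants are not watched automatically
    out = mwp_dense(a, w)

# output_gradients plays the role of the signal from the layer above.
p_bottom = tape.gradient(out, a, output_gradients=p_top).numpy()
```

Because the backward rule closes over a and w directly, it does not depend on input indices such as op.inputs[1], which makes it less fragile than walking the graph.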

iga