
There are a lot of examples of py_func usage on Stack Overflow, but I just want to define the gradient for my custom activation function, something like this, which uses only native TensorFlow operations. The example uses an identity forward pass.
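For clarity, this is roughly the kind of forward pass I mean, built from native TF ops only (identity, as in the linked example; the real OPLU forward is of course different):

import tensorflow as tf

# Illustrative only: the forward pass uses native TF ops, so no py_func is needed.
def identity_forward(x):
    return tf.identity(x)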

Suppose I have registered a gradient for my activation "OPLU" (the comments illustrate my understanding so far of what's going on):

import tensorflow as tf

@tf.RegisterGradient("OPLUGrad")
def oplugrad(op, grad):
    x = op.inputs[0]  # Need x!

    # This print should be executed if oplugrad is launched,
    # because it is set inside the evaluation chain for the output!
    x = tf.Print(x, [tf.shape(x)], message='debug: ')

    grad_new = x * grad  # just for example

    return grad_new

And I have defined my layer:

from tensorflow.python.framework import ops

def tf_oplu(x, name="OPLU"):

    y = ...f(x)...  # the forward pass, built from native TF ops only

    # Here a new op is created, as far as I understand
    with ops.op_scope([x], name, "OPLUop") as name:

        g = tf.get_default_graph()

        # As far as I understand, here I tell TensorFlow
        # to use "OPLUGrad" whenever the "OPLU" activation is applied
        with g.gradient_override_map({"OPLU": "OPLUGrad"}):
            # OK, gradient assigned, now return what the forward layer computes
            return y

But I don't see any output from tf.Print inside the gradient function, which means it is not being executed.

Question 1: How do I register it properly, and how should these two functions be set up, so that I can use the built-in optimizers such as AdamOptimizer?
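
For context, this is roughly how I intend to use the activation with a built-in optimizer (a hypothetical sketch; the layer sizes and loss are placeholders, and tf_oplu stands for the function defined above):

import tensorflow as tf

# tf_oplu here is a stand-in for the custom activation defined above;
# the real implementation would replace this line.
tf_oplu = tf.identity

x = tf.placeholder(tf.float32, [None, 784])
y_true = tf.placeholder(tf.float32, [None, 10])

W1 = tf.Variable(tf.truncated_normal([784, 128], stddev=0.1))
b1 = tf.Variable(tf.zeros([128]))
hidden = tf_oplu(tf.matmul(x, W1) + b1)  # custom activation instead of tf.nn.relu

W2 = tf.Variable(tf.truncated_normal([128, 10], stddev=0.1))
b2 = tf.Variable(tf.zeros([10]))
logits = tf.matmul(hidden, W2) + b2

loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=logits))

# The hope is that AdamOptimizer picks up "OPLUGrad" automatically during
# backprop, with no changes to the Session() code.
train_step = tf.train.AdamOptimizer(1e-4).minimize(loss)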

Question 2: As far as I understand, the standard gradient computation is suppressed this way. What if I want the standard gradients to be computed first and then apply some modification to them, without interfering with the Session() code, i.e. without the manual invocation and modification of gradients inside the Session() run that I've seen elsewhere?
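
To clarify what I mean by manual invocation and modification of gradients: the pattern I have seen (and would like to avoid) looks roughly like this hypothetical sketch built around compute_gradients()/apply_gradients():

import tensorflow as tf

# Hypothetical sketch of the pattern I'd like to avoid: computing the standard
# gradients and editing them by hand before they are applied.
w = tf.Variable(1.0)
loss = tf.square(w)

optimizer = tf.train.AdamOptimizer(1e-4)
grads_and_vars = optimizer.compute_gradients(loss)

# ... manual modification of each (dense) gradient would happen here ...
modified = [(g * 0.5 if g is not None else None, v) for g, v in grads_and_vars]

train_step = optimizer.apply_gradients(modified)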

EDIT: Here is the example of code in which I want to replace tf.nn.relu with my tf_oplu.

Thank you!

Slowpoke
    Possible duplicate of [How to register a custom gradient for a operation composed of tf operations](https://stackoverflow.com/questions/43256517/how-to-register-a-custom-gradient-for-a-operation-composed-of-tf-operations) – MZHm Jul 06 '17 at 21:10
  • @MZHm Yes! Your answer there is very useful, please excuse me for not finding it before. But could you please add a few explanations, like the ones I tried to give in my comments? As far as I can see, one needs to assign the gradient to Identity and then add an Identity transform at the end of the custom activation's computation, right? (See the sketch after these comments.) – Slowpoke Jul 06 '17 at 21:26
  • @MZHm Also, please excuse me for the very vague description in my question 2, but it is still a bit unclear to me which operation exactly I replace when I register a custom tf gradient function: do I override the computation of the entire expression that gets added to the weights, or just the computation of the activation derivative? The reason I ask is that I only want to implement the latter. – Slowpoke Jul 06 '17 at 21:35
  • @MZHm As far as I understand, the `grad` parameter is the vector that gets multiplied by the activation derivatives. – Slowpoke Jul 06 '17 at 21:45
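
Following up on my first comment, here is a minimal sketch of my current understanding of the trick from the linked question (unverified; it assumes the override has to target the built-in Identity op, and the names are just illustrative):

import tensorflow as tf

@tf.RegisterGradient("OPLUGrad")
def oplugrad(op, grad):
    x = op.inputs[0]
    return x * grad  # placeholder gradient, just for illustration

def tf_oplu(x, name="OPLU"):
    y = tf.identity(x)  # stand-in for the real forward pass f(x)
    g = tf.get_default_graph()
    # Override the gradient of Identity, then push y through an Identity op
    # created *inside* the override scope, so that backprop calls oplugrad.
    with g.gradient_override_map({"Identity": "OPLUGrad"}):
        return tf.identity(y, name=name)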
