There are a lot of examples of py_func usage on Stack Overflow, but I just want to define the gradient for my custom activation function, something like this, using only native TensorFlow operations. Take an identity forward pass as an example.
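To make the goal concrete, here is a toy sketch of the kind of forward pass I mean (an identity activation built only from native ops; the names and shapes are placeholders just for illustration):

import tensorflow as tf

def my_activation(x):
    # Toy forward pass: plain identity, using only native TF ops.
    # In my real code this would be the OPLU computation.
    return tf.identity(x)

x = tf.placeholder(tf.float32, shape=[None, 4])
y = my_activation(x)
# What I want: when tf.gradients(y, x) (or an optimizer) is evaluated,
# my own gradient function should be used instead of the
# automatically derived one.
dy_dx = tf.gradients(y, [x])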
Suppose I have registered a gradient for my activation "OPLU" (the comments illustrate my understanding so far of what's going on):
@tf.RegisterGradient("OPLUGrad")
def oplugrad(op, grad):
x = op.inputs[0] # Need x !
# This print should be executed if oplugrad was launched!
# Because it was set inside the evaluation chain for output !
x = tf.Print(x, [tf.shape(x)], message = 'debug: ')
grad_new = x*grad # let it be, just for example
return grad_new
And I defined my layer:
def tf_oplu(x, name="OPLU"):
y = ...f(x)...
# Here new op is created, as far as I understand
with ops.op_scope([x], name, "OPLUop") as name:
g = tf.get_default_graph()
# As far as I understand, here I issue command to tensorflow
# to use "OPLUGrad" when "OPLU" activation was applied
with g.gradient_override_map({"OPLU": "OPLUGrad"}):
# OK, gradient assigned, now return what forward layer computes
return y
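I then call it in place of tf.nn.relu when building the network, roughly like this (the layer sizes are made up, just to show where the activation sits; tf_oplu is the function defined above):

x = tf.placeholder(tf.float32, shape=[None, 784])
W = tf.Variable(tf.truncated_normal([784, 128], stddev=0.1))
b = tf.Variable(tf.zeros([128]))

# previously: h = tf.nn.relu(tf.matmul(x, W) + b)
h = tf_oplu(tf.matmul(x, W) + b)

# ... followed by the rest of the model, a loss, and
# train_step = tf.train.AdamOptimizer(1e-3).minimize(loss)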
But I don't see any output from the tf.Print inside the gradient function, which means it is not executed.
Question 1: How do I register it properly, and how should these two functions be written, so that I can use built-in optimizers like AdamOptimizer?
Question 2: As far as I understand, the standard gradient computation is suppressed this way. What if I want the standard gradients to be computed first and then modified slightly, without touching the Session() code, i.e. without the manual invocation and modification of gradients in the Session() run that I've seen elsewhere?
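By "manual invocation and modification of gradients" I mean the kind of pattern I have seen elsewhere, roughly like this (a sketch from memory; the dummy loss is just a stand-in for a real model), which I would rather not have to write around every optimizer call:

w = tf.Variable(3.0)
loss = tf.square(w)  # stand-in for the real model loss

optimizer = tf.train.AdamOptimizer(learning_rate=1e-3)
# get the standard, automatically computed gradients...
grads_and_vars = optimizer.compute_gradients(loss)
# ...modify them by hand (here just scaled, as an example)...
modified = [(0.5 * g if g is not None else None, v) for g, v in grads_and_vars]
# ...and apply the modified gradients
train_step = optimizer.apply_gradients(modified)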
EDIT: Here is the example code in which I want to replace tf.nn.relu with my tf_oplu.
Thank you!