I am trying to implement the "feed-forward convolutional/deconvolutional residual encoder" as described in this paper: https://arxiv.org/abs/1511.06085

In the network architecture they use a binarization layer, where they first use a standard fully-connected layer with tanh activation to produce a vector with components in the continuous interval [-1, 1]. Then they probabilistically map each component to either -1 or 1. The problem is that backpropagation cannot be applied in the usual way to this second, stochastic step. After some reasoning, the authors state that they simply pass the gradients through this stage unchanged.
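
To make the forward pass concrete: as I understand the paper, a component x in [-1, 1] is mapped to 1 with probability (1 + x)/2 and to -1 otherwise, so that the expectation of the binarized value equals x. A minimal NumPy sketch of just this forward step (my own interpretation, not code from the paper):

import numpy as np

def binarize_forward(x):
    # x: array with entries in [-1, 1] (the tanh outputs).
    # Each entry becomes 1 with probability (1 + x) / 2 and -1 otherwise,
    # so the expected value of the output equals x.
    p = (x + 1.0) / 2.0
    return np.where(np.random.uniform(size=x.shape) < p, 1.0, -1.0)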

Now my question is: how can I implement this in TensorFlow? Is there a simple way to define custom gradients for an operation? A simple example would be much appreciated.

EDIT:

Would the following code do what I want?

import tensorflow as tf
from tensorflow.python.framework import ops


def binarization(x):
    g = tf.get_default_graph()
    with ops.name_scope("Binarization") as name:
        # Override the gradients of every op created in this block so that the
        # incoming gradient is passed straight through to x. Note: the Python
        # "/" operator creates a RealDiv op for float tensors, so "RealDiv" is
        # overridden in addition to "Div".
        with g.gradient_override_map({"Ceil": "Identity",
                                      "Sub": "CustomGrad",
                                      "Div": "CustomGrad",
                                      "RealDiv": "CustomGrad",
                                      "Add": "CustomGrad",
                                      "Mul": "CustomGrad"}):
            # Map x from [-1, 1] to [0, 1].
            scaled_x = (x + 1) / 2
            # Stochastic binarization: 1 with probability scaled_x, else 0.
            binary_x = tf.ceil(scaled_x - tf.random_uniform(tf.shape(x)), name=name)
            # Map back from {0, 1} to {-1, 1}.
            return (binary_x * 2) - 1


@ops.RegisterGradient("CustomGrad")
def customGrad(op, grad):
    # Pass the gradient through to the first input unchanged; the second
    # input (a constant or the random tensor) gets a zero gradient.
    return [grad, tf.zeros(tf.shape(op.inputs[1]))]
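
For reference, this is how I would sanity-check the gradient on my side (just a quick test, assuming TensorFlow 1.x): if the straight-through behaviour is correct, the gradient of the binarized output with respect to x should be 1.0 for every component.

x = tf.placeholder(tf.float32, shape=[None, 4])
b = binarization(x)
# If the straight-through override works, d(sum(b))/dx should be 1 everywhere.
grad = tf.gradients(tf.reduce_sum(b), x)[0]

with tf.Session() as sess:
    out = sess.run([b, grad],
                   feed_dict={x: [[0.3, -0.7, 0.0, 0.9]]})
    print(out[0])  # entries are -1 or 1
    print(out[1])  # should be all 1.0
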
  • Here I have shown how to define custom gradients with examples: https://stackoverflow.com/questions/43839431/tensorflow-how-to-replace-or-modify-gradient – BlueSun Nov 27 '17 at 17:40
  • @BlueSun Thank you very much! The example helped me to come up with some code. Could you have a look at it and tell me if I'm doing OK? Thanks :) (see edit) – Stettler Vincent Nov 27 '17 at 20:21
