
I have a gradient entering layer L1 from layers L2_1 and L2_2 at the same time, and I need to rescale the combined gradient (L2_1 + L2_2) by 1/sqrt(2) before it enters L1. How can I do this?

My network looks something like this:

                L2_1
               /    \
input -> L0 - L1     L_final
               \    /
                L2_2
userqwerty1

1 Answer


You can divide the L2_1 and L2_2 outputs by sqrt(2). That will rescale both the activations and the backprop. If you want to modify only the backprop but not the activations, you can use the gradient replacement trick from here
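For concreteness, here is a minimal sketch of both options in TensorFlow 1.x-style graph code. The helper name `scale_grad_only`, the placeholder tensors, and their shapes are illustrative assumptions, not part of the original answer:

    import math
    import tensorflow as tf

    def scale_grad_only(x, scale):
        # Forward pass returns x unchanged: scale*x + (x - scale*x) == x.
        # Backward pass: the stop_gradient term contributes no gradient,
        # so only the `scale * x` term is differentiated through.
        x_scaled = scale * x
        return x_scaled + tf.stop_gradient(x - x_scaled)

    # Illustrative stand-ins for the outputs of L2_1 and L2_2.
    L2_1 = tf.placeholder(tf.float32, [None, 128])
    L2_2 = tf.placeholder(tf.float32, [None, 128])

    s = 1.0 / math.sqrt(2.0)

    # Option 1: rescale both the activations and the gradient.
    L2_1_scaled = s * L2_1
    L2_2_scaled = s * L2_2

    # Option 2: rescale only the gradient; forward activations stay the same.
    L2_1_grad_scaled = scale_grad_only(L2_1, s)
    L2_2_grad_scaled = scale_grad_only(L2_2, s)

    # Feed the chosen pair into the layer that produces L_final.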

Yaroslav Bulatov
  • `L2_1_t = 1/sqrt(2)*L2_1; L2_1_y = L2_1_t + tf.stop_gradient(L2_1 - L2_1_t)` and `L2_2_t = 1/sqrt(2)*L2_2; L2_2_y = L2_2_t + tf.stop_gradient(L2_2 - L2_2_t)`, and in the model construction code I would use `L2_1_y` and `L2_2_y` in place of `L2_1` and `L2_2` (as input to the next layer) – is this right? – userqwerty1 May 09 '16 at 16:47
  • Looks right at first glance, but feel free to update this Q if you try it and it works, since others may have the same request – Yaroslav Bulatov May 09 '16 at 16:55