
I'm trying to use TensorFlow to minimize the loss function L below with respect to u. There are three variables, u, x_opt, and L, with the following dependency graph:

u ---(f)--> x_opt ---(g)--> L,

with the exact form of the dependency governed by functions f and g.

import numpy as np
import scipy.optimize

def f(u):

    def f_helper(u, x):
        # with u held fixed, f_helper is a convex function of x
        # the exact form of f_helper does not matter
        return np.linalg.norm(x - u)

    curried_f_helper = lambda x: f_helper(u, x)
    x_opt = scipy.optimize.minimize(curried_f_helper, np.random.uniform(size=u.shape))['x']
    return x_opt

def g(x_opt):
    # the exact form of g does not matter
    return np.ones(x_opt.shape).dot(x_opt)

def L(u):
    # want to optimize L over u
    x_opt = f(u)
    return g(x_opt)

# use TensorFlow to minimize L over u...

The complication is that f() has no closed analytical form: u parameterizes an optimization problem whose solution is x_opt, so TensorFlow cannot compute the gradient of f with respect to u automatically. However, I can compute this gradient manually via implicit differentiation. Ideally, I'd be able to define a new op representing f and register its (manually calculated) gradient.
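To make the implicit-differentiation step concrete, here is a minimal sketch of the gradient I would register, using a hypothetical quadratic inner objective h(u, x) = 0.5·xᵀAx - uᵀx (a stand-in for f_helper; A and h are assumptions, not part of my actual problem). Since grad_x h vanishes at x_opt, differentiating that condition with respect to u gives H_xx · dx/du + H_xu = 0, so dx/du = -H_xx⁻¹ H_xu, which is exactly the Jacobian my custom gradient would return:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical convex inner objective h(u, x) = 0.5*x@A@x - u@x,
# so analytically x_opt(u) = solve(A, u) and dx_opt/du = inv(A).
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])  # symmetric positive definite

def h(u, x):
    return 0.5 * x @ A @ x - u @ x

def f(u):
    # solve the inner problem numerically, as in the real f()
    return minimize(lambda x: h(u, x), x0=np.zeros_like(u)).x

def implicit_jacobian(u, x_opt):
    # Implicit function theorem at the optimum (grad_x h = 0):
    #   H_xx @ dx/du + H_xu = 0   =>   dx/du = -inv(H_xx) @ H_xu
    H_xx = A                    # Hessian of h in x
    H_xu = -np.eye(len(u))      # d/du of grad_x h = A@x - u
    return -np.linalg.solve(H_xx, H_xu)

u = np.array([1.0, -2.0])
x_opt = f(u)
J = implicit_jacobian(u, x_opt)          # equals inv(A) here
grad_L = J.T @ np.ones_like(x_opt)       # chain rule; g(x) = 1.x so grad_g = ones
```

The registered gradient for the f op would apply J.T to the incoming gradient, exactly as the last line does for g.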

My question is: How should I implement the op representing f and specify its gradient? Is it possible to define the op for f using only Python, and if so, will I have to use tf.py_func?

  • Am I correct in understanding that your question isn't really about implicit differentiation (that's just an implementation detail), but it's really about manually specifying a gradient function? – Josephine Moeller Jun 11 '16 at 22:17
  • If that's the case, it looks like it can be done with `Graph.gradient_override_map` [see this issue here](https://github.com/tensorflow/tensorflow/issues/1095), but it's ugly. – Josephine Moeller Jun 11 '16 at 22:21
  • Interesting, it looks like there's a hack in [this answer](http://stackoverflow.com/a/36480182/716440) – Josephine Moeller Jun 11 '16 at 22:26
  • Hi @JohnMoeller. Right, I've changed the title. That might take care of specifying the gradient (I need to read more), but what would be the easiest way to define the op? If given `u`, I could calculate, using only TensorFlow ops (like Minimizers) the value of `f(u)`, but I'm confused where to place this logic. I want to do something like "define a new op, x_opt, that is the result of running 100 gradient descent steps over `x` on `f_helper` with `u` fixed." – Fulton Wang Jun 11 '16 at 23:00
  • That I don't know. The design of TF doesn't seem to accommodate custom gradients well. – Josephine Moeller Jun 11 '16 at 23:01
  • I'm not sure I totally understand your question, but maybe my answers to the following two questions are an answer to your question: http://stackoverflow.com/questions/39048984/tensorflow-how-to-write-op-with-gradient-in-python/39984513#39984513 http://stackoverflow.com/questions/39921607/tensorflow-how-to-make-a-custom-activation-function-with-only-python/39921608#39921608 – patapouf_ai Oct 11 '16 at 19:06
