
I want to define a custom loss function in Keras that contains the gradient of the difference between y_true and y_pred. I found that numpy.gradient can give me the gradient of an array, so part of my loss function looks like this:

import numpy as np

def loss(y_true, y_pred):
    d = y_true - y_pred
    gradient_x = np.gradient(d, axis=0)
    gradient_y = np.gradient(d, axis=1)

but it turns out that d is a TensorFlow tensor and numpy.gradient can't process it. I'm fairly new to Keras and TensorFlow.

Is there any other function that can help me do this, or do I have to compute the gradient myself?

3 Answers


I ran into the same problem of wanting to define a loss function using np.gradient, and I wrote a pure TensorFlow version of the function to work around it.

Here's my version (it has the same behavior as np.gradient with axis=-1; if you want it to work for an arbitrary axis, you'll need to play around with it a bit more):

import tensorflow as tf

def my_gradient_tf(a):
    # right neighbor of every element, repeating the last element
    rght = tf.concat((a[..., 1:], tf.expand_dims(a[..., -1], -1)), -1)
    # left neighbor of every element, repeating the first element
    left = tf.concat((tf.expand_dims(a[..., 0], -1), a[..., :-1]), -1)
    # divisor: 1 at the two endpoints (one-sided differences),
    # 2 in the interior (central differences), exactly as np.gradient does;
    # a.dtype instead of a hard-coded tf.float64 keeps float32 inputs working
    ones = tf.ones_like(rght[..., 2:], a.dtype)
    one = tf.expand_dims(ones[..., 0], -1)
    divi = tf.concat((one, ones * 2, one), -1)
    return (rght - left) / divi
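A quick sanity check against np.gradient (my addition; assumes TensorFlow 2.x eager execution so .numpy() is available):

import numpy as np

a = tf.constant([[1.0, 2.0, 4.0, 7.0]], dtype=tf.float64)
print(my_gradient_tf(a).numpy())        # [[1.  1.5 2.5 3. ]]
print(np.gradient(a.numpy(), axis=-1))  # same values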


TensorFlow tensors are not arrays while your Python code runs; they are only references to a computational graph that is being built. You might want to review the TensorFlow tutorial on how graphs are built.

You have two problems with your loss function: first, the gradient along either axis is not a scalar, and a loss must be reduced to a scalar before TensorFlow can differentiate it; second, there is no equivalent of np.gradient built into TensorFlow.

For the first problem, you can solve it by reducing along the remaining axes of gradient_y or gradient_x. I don't know which reduction you might want to use because I don't know your application.
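For instance, a minimal sketch (my assumption of an L1-style reduction; the right choice depends on your task):

loss_x = tf.reduce_mean(tf.abs(gradient_x))  # collapse to a scalar
loss_y = tf.reduce_mean(tf.abs(gradient_y))
total_loss = loss_x + loss_y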

The second problem can be fixed in two ways:

  1. You could wrap np.gradient using py_func, but since you plan to use this as a loss function, you will need to take the gradient of that function, and defining the gradient of a py_func call is complicated (a forward-only sketch follows this list).
  2. Write your own version of np.gradient using pure TensorFlow.
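To illustrate the first way, a forward-pass-only sketch (my addition; tf.py_function is the TF 2.x name for py_func, and no gradient flows through the NumPy call, which is exactly why the second way is usually better for a loss):

import numpy as np
import tensorflow as tf

def np_gradient_op(d):
    # TensorFlow cannot backpropagate through the NumPy call
    return tf.py_function(lambda t: np.gradient(t.numpy(), axis=0), [d], Tout=d.dtype)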

For the second way, here's a sketch of a 1D np.gradient in TensorFlow:

def gradient(x):
    # forward differences, length N-1
    d = x[1:] - x[:-1]
    # pad so every point has both a forward and a backward difference
    fd = tf.expand_dims(tf.concat([d, d[-1:]], 0), 1)
    bd = tf.expand_dims(tf.concat([d[:1], d], 0), 1)
    # averaging the two gives central differences in the interior and
    # one-sided differences at the endpoints, just like np.gradient
    return tf.reduce_mean(tf.concat([fd, bd], 1), 1)
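A quick check (my addition, assuming eager execution):

x = tf.constant([1.0, 2.0, 4.0, 7.0])
print(gradient(x).numpy())  # [1.  1.5 2.5 3. ], matching np.gradient(x)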

Here's what I came up with for 3D arrays, thanks to @Bernat Gene:

def tf_gradient_3d(a):
    # axis = 1
    left = tf.concat([a[:, 1:], tf.expand_dims(a[:, -1], 1)], axis=1)
    right = tf.concat([tf.expand_dims(a[:, 0], 1), a[:, :-1]], axis=1)

    # divisor: 1 at the endpoints, 2 in the interior (as in np.gradient);
    # a.dtype instead of a hard-coded tf.float64 keeps float32 inputs working
    ones = tf.ones_like(right[:, 2:], a.dtype)
    one = tf.expand_dims(ones[:, 0], 1)
    dx = tf.concat((one, ones * 2, one), 1)

    gx = (left - right) / dx

    # axis = 0
    left = tf.concat([a[1:, :], tf.expand_dims(a[-1, :], 0)], axis=0)
    right = tf.concat([tf.expand_dims(a[0, :], 0), a[:-1, :]], axis=0)

    ones = tf.ones_like(right[2:], a.dtype)
    one = tf.expand_dims(ones[0], 0)
    dx = tf.concat((one, ones * 2, one), 0)

    gy = (left - right) / dx

    # axis = 2
    left = tf.concat([a[:, :, 1:], tf.expand_dims(a[:, :, -1], 2)], axis=2)
    right = tf.concat([tf.expand_dims(a[:, :, 0], 2), a[:, :, :-1]], axis=2)

    ones = tf.ones_like(right[:, :, 2:], a.dtype)
    one = tf.expand_dims(ones[:, :, 0], 2)
    dx = tf.concat((one, ones * 2, one), 2)

    gz = (left - right) / dx

    return gx, gy, gz
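A quick check against np.gradient (my addition; note that np.gradient returns the axis-0 gradient first, which corresponds to gy here):

import numpy as np

a = tf.constant(np.random.rand(4, 5, 6))  # float64
gx, gy, gz = tf_gradient_3d(a)
gy_ref, gx_ref, gz_ref = np.gradient(a.numpy())
np.testing.assert_allclose(gx.numpy(), gx_ref)
np.testing.assert_allclose(gy.numpy(), gy_ref)
np.testing.assert_allclose(gz.numpy(), gz_ref)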

Hope it helps :)


P.S.: What actually happens is something like this:

grad[0] = (vals[1] - vals[0]) / dx;
grad[i] = (vals[i+1] - vals[i-1]) / (2*dx);  // for i in [1,N-2]
grad[N-1] = (vals[N-1] - vals[N-2]) / dx;

but you need to translate this formula into TensorFlow operations.