
After going through some Stack Overflow questions and the Keras documentation, I managed to write some code that tries to evaluate the gradient of the output of a neural network w.r.t. its inputs. The purpose is a simple exercise: approximating a bivariate function (f(x,y) = x^2 + y^2) using as the loss the difference between the analytical gradient and the one obtained by automatic differentiation.
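(For reference, the analytical gradient here is ∇f(x, y) = (2x, 2y), so the sum of its components, 2x + 2y, is the quantity that grad_true in the code below is meant to compute.)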

Combining answers from two questions (Keras custom loss function: Accessing current input pattern and Getting gradient of model output w.r.t weights using Keras), I came up with this:

import tensorflow as tf
from keras import backend as K
from keras.models import Model
from keras.layers import Dense, Activation, Input

def custom_loss(input_tensor):

    outputTensor = model.output
    listOfVariableTensors = model.input
    # Symbolic gradient of the model output w.r.t. the model input
    gradients = K.gradients(outputTensor, listOfVariableTensors)

    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())
    evaluated_gradients = sess.run(gradients, feed_dict={model.input: input_tensor})

    grad_pred = K.add(evaluated_gradients[0], evaluated_gradients[1])
    grad_true = K.add(K.scalar_mul(2, model.input[0][0]), K.scalar_mul(2, model.input[0][1]))

    return K.square(K.subtract(grad_pred, grad_true))

input_tensor = Input(shape=(2,))
hidden = Dense(10, activation='relu')(input_tensor)
out = Dense(1, activation='sigmoid')(hidden)
model = Model(input_tensor, out)
model.compile(loss=custom_loss(input_tensor), optimizer='adam')

This yields the error TypeError: The value of a feed cannot be a tf.Tensor object., caused by feed_dict={model.input: input_tensor}. I understand the error; I just don't know how to fix it.

From what I gathered, I can't simply pass input data into the loss function; it must be a tensor. I realized Keras would 'understand' it when I pass input_tensor directly. This all leads me to think I'm going about it the wrong way, trying to evaluate the gradient like that. I would really appreciate some enlightenment.

Lucas Farias

2 Answers


I don't really understand why you want this loss function, but I will provide an answer anyway. Also, there is no need to evaluate the gradient within the loss function; in fact, doing so would be "disconnecting" the computational graph. The loss function could be implemented as follows:

from keras import backend as K
from keras.models import Model
from keras.layers import Dense, Input

def custom_loss(input_tensor, output_tensor):
    def loss(y_true, y_pred):
        # Symbolic gradient of the model output w.r.t. the input, kept inside the graph
        gradients = K.gradients(output_tensor, input_tensor)[0]
        grad_pred = K.sum(gradients, axis=-1)
        # Analytical gradient of f(x, y) = x^2 + y^2, summed over components: 2x + 2y
        grad_true = K.sum(2 * input_tensor, axis=-1)
        return K.square(grad_pred - grad_true)
    return loss

input_tensor = Input(shape=(2,))
hidden = Dense(10, activation='relu')(input_tensor)
output_tensor = Dense(1, activation='sigmoid')(hidden)
model = Model(input_tensor, output_tensor)
model.compile(loss=custom_loss(input_tensor, output_tensor), optimizer='adam')
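For completeness, a minimal sketch of how this could be trained (assuming NumPy input data; since the wrapped loss closes over the graph tensors and ignores y_true and y_pred, dummy targets are enough):

import numpy as np

# Hypothetical training data: random points in the plane.
# The targets are unused by the loss, so zeros of the right shape suffice.
X = np.random.rand(1000, 2)
y_dummy = np.zeros((1000, 1))

model.fit(X, y_dummy, epochs=10, batch_size=32)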
rvinas
  • As I said, it's just a toy example that follows, roughly, the same structure as a more complex problem. Thank you a lot for helping! – Lucas Farias Apr 06 '18 at 17:06
  • By the way, is it really necessary to wrap the y_true/y_pred loss function? – Lucas Farias Apr 06 '18 at 17:07
  • Yes, this is necessary because `model.compile` expects either a string (the name of a built-in objective) or a function with the signature (y_true, y_pred). I am happy to help you! – rvinas Apr 06 '18 at 17:11

A Keras loss must take y_true and y_pred as inputs. You can try passing your input data as both x and y during the fit:

def custom_loss(y_true, y_pred):
    ...
    return K.square(grad_true - grad_pred)

...
model.compile(loss=custom_loss, optimizer='adam')

model.fit(X, X, ...)

This way, y_true will be the batch of inputs being processed at each iteration, while y_pred will be the model's output for that particular batch.
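For instance, a minimal sketch of how the pieces could fit together (assuming the model from the question is in scope and X is an array of shape (n, 2); this also relies on Keras skipping the target-shape check for custom losses, since y has shape (n, 2) while the output has shape (n, 1)):

import numpy as np
from keras import backend as K

def custom_loss(y_true, y_pred):
    # y_true is the input batch itself, because we fit with X as both x and y
    gradients = K.gradients(model.output, model.input)[0]  # d(output)/d(input)
    grad_pred = K.sum(gradients, axis=-1)
    grad_true = K.sum(2 * y_true, axis=-1)  # analytical gradient of x^2 + y^2
    return K.square(grad_true - grad_pred)

model.compile(loss=custom_loss, optimizer='adam')

X = np.random.rand(1000, 2)
model.fit(X, X, epochs=10, batch_size=32)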

ebeneditos