I am trying to implement an algorithm that requires a vector to be multiplied by the Hessian of a neural network's output with respect to its weights. I'm having trouble figuring out the syntax needed to get the Theano backend to perform the calculation.

For now, I'm willing to calculate the full Hessian if need be, but I'd prefer to use an optimization called the R-operator that makes the Hessian-times-vector product more efficient.
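On a standalone Theano expression, the R-operator version is straightforward. This is a minimal sketch adapted from the Theano gradients tutorial linked below; the toy function y = sum(x**2) is just for illustration and is not from my real code:

import theano
import theano.tensor as T

x = T.dvector('x')
v = T.dvector('v')
y = T.sum(x ** 2)        # toy scalar output, Hessian is 2*I
gy = T.grad(y, x)        # gradient of y w.r.t. x
Hv = T.Rop(gy, x, v)     # Hessian-vector product via the R-operator
hv = theano.function([x, v], Hv)

print(hv([1.0, 2.0], [0.5, 0.5]))   # -> [1., 1.]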

I (think I) know that what I want to do can't be done in Keras itself. I've found that Theano implements the R-operator (TensorFlow doesn't seem to, as far as I can tell), so I'm attempting to get at the Theano backend, extract the output tensor and weights, and use Theano's gradient functions to perform the calculation.

I've been referring to the following links: Accessing gradient values of keras model outputs with respect to inputs

http://deeplearning.net/software/theano/tutorial/gradients.html

http://deeplearning.net/software/theano/tutorial/examples.html

http://deeplearning.net/software/theano/library/gradient.html

I've tried a few variations, such as:

# Calculate the full Hessian manually with keras.backend
from keras import backend as k

outputTensor = theta_nnet.model.output
listOfVariableTensors = theta_nnet.model.trainable_weights
gradients = k.gradients(outputTensor[0, 0], listOfVariableTensors)
hessian   = k.gradients(gradients, listOfVariableTensors)
# Errors out with an AttributeError because the cost passed to the second
# gradients call (a list of gradient tensors) has no attribute named 'type'.
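Since the second gradients call apparently needs a scalar cost rather than a list, one direction I'm considering is dotting the gradients with the vector first and differentiating that scalar (the "double grad" trick). A sketch of what I mean; the v_parts placeholders are hypothetical and just hold the vector, one piece per weight tensor:

# Hessian-vector product via the double-grad trick (sketch, not verified)
from keras import backend as k

outputTensor = theta_nnet.model.output
listOfVariableTensors = theta_nnet.model.trainable_weights

gradients = k.gradients(outputTensor[0, 0], listOfVariableTensors)

# One placeholder per weight tensor to hold the matching piece of the vector
v_parts = [k.placeholder(ndim=k.ndim(w)) for w in listOfVariableTensors]

# Scalar g . v, then a second grad gives H v without forming the full Hessian
gv = sum(k.sum(g * v) for g, v in zip(gradients, v_parts))
Hv = k.gradients(gv, listOfVariableTensors)

hv_fn = k.function(theta_nnet.model.inputs + v_parts, Hv)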

Another example:

# Calculate the full Hessian manually by using theano grad
import theano.tensor as T

outputTensor = theta_nnet.model.output
listOfVariableTensors = theta_nnet.model.trainable_weights
theano_grad = T.grad(outputTensor[0, 0], listOfVariableTensors)
theano_Hv = T.grad(T.sum(theano_grad * w), listOfVariableTensors)  # w is a numpy ndarray
# AsTensorError exception trying to convert w, a numpy ndarray, to a tensor
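If I'm reading that error correctly, w would need to be symbolic (for example, wrapped in shared variables) and matched against the gradient list element-wise rather than multiplied against the list as a whole. A sketch of what I think that looks like, where w_parts is a hypothetical list of numpy arrays holding the vector, one per weight tensor:

import theano
import theano.tensor as T

outputTensor = theta_nnet.model.output
listOfVariableTensors = theta_nnet.model.trainable_weights
theano_grad = T.grad(outputTensor[0, 0], listOfVariableTensors)   # list of tensors

# Wrap each piece of the vector in a shared variable so it can be used symbolically
w_shared = [theano.shared(w_i) for w_i in w_parts]

# Scalar g . w, then a second grad gives the Hessian-vector product
gw = sum(T.sum(g * w) for g, w in zip(theano_grad, w_shared))
theano_Hv = T.grad(gw, listOfVariableTensors)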

I also had some variations where I attempted to use T.hessian and T.Rop. I think my issue is more fundamental: I don't understand how to take derivatives of a trained neural net's output with respect to its weights. This page, http://deeplearning.net/software/theano/tutorial/gradients.html, was very helpful, but it doesn't show the final step of applying these operators to a neural net, and I couldn't figure that out from the other sources (or lots of additional Googling).
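For completeness, the T.Rop direction I've been attempting looks roughly like the sketch below (again with the hypothetical w_parts list holding the vector); I haven't gotten a variation of this to run yet:

import theano
import theano.tensor as T

outputTensor = theta_nnet.model.output
listOfVariableTensors = theta_nnet.model.trainable_weights
theano_grad = T.grad(outputTensor[0, 0], listOfVariableTensors)

# R-operator applied to the gradient: Rop(g, W, v) = (dg/dW) v = H v,
# without ever forming the full Hessian
v_shared = [theano.shared(w_i) for w_i in w_parts]
theano_Hv = T.Rop(theano_grad, listOfVariableTensors, v_shared)

# Evaluating still needs the network input, so compile against the model's
# input tensors (Theano backend)
hv_fn = theano.function(theta_nnet.model.inputs, theano_Hv,
                        allow_input_downcast=True)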

If anyone can provide any pointers, it would be greatly appreciated.
