
I'm trying to get a vectorized version of the Theano gradient and Hessian, i.e. I want to compute the gradient and Hessian at several points at once, with the points given as the rows of a matrix, as shown below.

I have a function:

f(x_1,x_2,..,x_n)=exp(x_1^2+x_2^2+...+x_n^2)
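For reference, this f has simple closed-form derivatives: the gradient is 2x·f(x) and the Hessian is f(x)·(2I + 4xxᵀ). A minimal NumPy check of these formulas (plain NumPy, not Theano, just to pin down what the symbolic code below should produce):

```python
import numpy as np

def f(x):
    # f(x) = exp(x_1^2 + ... + x_n^2)
    return np.exp(np.sum(x**2))

def grad_f(x):
    # closed-form gradient: 2 * x * f(x)
    return 2 * x * f(x)

def hess_f(x):
    # closed-form Hessian: f(x) * (2*I + 4 * x x^T)
    n = x.size
    return f(x) * (2 * np.eye(n) + 4 * np.outer(x, x))
```

At the point (1, 2), for example, f is e^5, the gradient is (2e^5, 4e^5), and the Hessian is e^5 times [[6, 8], [8, 18]].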

and I want to compute its gradient at multiple points with one command. I can do this like so:

import theano
import theano.tensor as T

x = T.matrix('x')
# each row of x is one point; the diagonal of exp(x x^T) is f evaluated at each row
y = T.diag(T.exp(T.dot(x, x.T)))
J = theano.grad(cost=y.sum(), wrt=x)
f = theano.function(inputs=[x], outputs=J)
f([[1, 2], [3, 4]])

It returns a matrix whose rows are the gradients computed at the points (1,2) and (3,4). I want the same thing for the Hessian (in that case the result would be a 3-dimensional tensor rather than a matrix, but the idea is the same). The following code:

H = theano.gradient.hessian(cost = y.sum(), wrt = x)

returns an error:

AssertionError: tensor.hessian expects a (list of) 1 dimensional variable as `wrt`

I was able to achieve the appropriate result with the following code:

J = theano.grad(cost=y.sum(), wrt=x)
# Hessian as the Jacobian of the flattened gradient
H = theano.gradient.jacobian(expression=J.flatten(), wrt=x)
g = theano.function(inputs=[x], outputs=H)
g([[1, 2], [3, 4]])

but it produces a lot of unnecessary zeros (each point's gradient depends only on that point, so all cross-point blocks of the Jacobian are zero), and it seems like an inefficient, "ugly" way of obtaining the desired result. Has anyone had a similar problem, or can you suggest anything?
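For this particular f, the dense (points, n, n) stack of per-point Hessians can be built directly from the closed form, which shows the target shape without the cross-point zero blocks. A NumPy sketch (it uses the analytic Hessian f(x)·(2I + 4xxᵀ) rather than Theano's symbolic differentiation, so it illustrates the desired output rather than answering the autodiff question):

```python
import numpy as np

def batched_hessian(X):
    # X: (m, n) matrix whose rows are the points.
    # Returns an (m, n, n) tensor: one Hessian of f(x) = exp(||x||^2) per row.
    m, n = X.shape
    fvals = np.exp(np.sum(X**2, axis=1))          # (m,)  f at each point
    outer = 4 * np.einsum('mi,mj->mij', X, X)     # (m, n, n)  4 * x x^T per row
    return fvals[:, None, None] * (2 * np.eye(n)[None, :, :] + outer)
```

Calling `batched_hessian(np.array([[1., 2.], [3., 4.]]))` gives a (2, 2, 2) tensor with no zero padding, one 2x2 Hessian per input point.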


0 Answers