The way to do this for now, if you want to keep things "mostly" in TensorFlow, would be to do what you and Berci were discussing in the comments: run the TensorFlow graph up to the point where you need to solve the linear system, then feed the result back in with a feed_dict. In pseudocode:
import tensorflow as tf

saved_tensor1 = tf.Variable(...)  # shape/dtype must match tensor1 below
saved_tensor2 = tf.Variable(...)  # shape/dtype must match tensor2 below

# First half of your model
tensor1, tensor2 = ...  # various stuff
do_save_tensor1 = saved_tensor1.assign(tensor1)
do_save_tensor2 = saved_tensor2.assign(tensor2)
your_cholesky = tf.cholesky(your_other_tensor)

## THIS IS THE SPLIT POINT

# Second half of your model starts here
solved_system = tf.placeholder(...)  # you'll feed this in with feed_dict
final_answer = do_something_with(saved_tensor1, saved_tensor2, solved_system)
Then to run the whole thing, do:
sess = tf.Session()
sess.run(tf.global_variables_initializer())
_, _, cho = sess.run([do_save_tensor1, do_save_tensor2, your_cholesky])
solution = ...  # solve your linear system with scipy, using cho
feed_dict = {solved_system: solution}
answer = sess.run(final_answer, feed_dict=feed_dict)
The key here is stashing your intermediate results in tf.Variables so that you can resume the computation afterwards.
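To see why the stash works, here's a minimal standalone sketch (with a made-up variable v) showing that a tf.Variable keeps its assigned value across separate sess.run calls within the same session:

import tensorflow as tf

v = tf.Variable(0.0)
bump = v.assign(v + 1.0)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
sess.run(bump)      # first run: assigns 1.0 to v
print(sess.run(v))  # a later run: prints 1.0, the value persisted

Because the second sess.run sees the values already sitting in the variables, the first half of the graph doesn't get recomputed when you run final_answer.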
(I'm not promising that what you get out of tf.cholesky is in the right format to feed directly to scipy, or that you wouldn't be better off just pulling out the matrix at an earlier step and handing that to scipy, but this overall workflow should work for you.)
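If you do hand the Cholesky factor to scipy, note that tf.cholesky returns the lower-triangular factor, which matches scipy.linalg.cho_solve's (factor, lower) convention. A sketch with made-up stand-ins for the matrix and right-hand side:

import numpy as np
from scipy.linalg import cholesky, cho_solve

# Stand-ins: in your code, cho would come from the sess.run call above
A = np.array([[4.0, 2.0], [2.0, 3.0]])  # symmetric positive-definite
b = np.array([1.0, 2.0])
cho = cholesky(A, lower=True)           # lower-triangular, like tf.cholesky

x = cho_solve((cho, True), b)           # solves A x = b using the factor
print(np.allclose(A.dot(x), b))         # True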
Note that this can create a performance bottleneck if you're running heavy multicore or GPU operations and then have to serialize to hand the matrix off to scipy, but it might also be just fine; it depends a lot on your setting.