The way to do this for now, if you want to keep things "mostly" in TensorFlow, would be to do what you and Berci were discussing in the comments: run the TensorFlow graph up to the point where you need to solve the linear system, then feed the result back in with a feed_dict. In pseudocode:
import tensorflow as tf

saved_tensor1 = tf.Variable(...)  # shape/dtype must match tensor1 below
saved_tensor2 = tf.Variable(...)  # shape/dtype must match tensor2 below

# First half of your model
tensor1, tensor2 = ...  # various stuff
do_save_tensor1 = saved_tensor1.assign(tensor1)
do_save_tensor2 = saved_tensor2.assign(tensor2)
your_cholesky = tf.cholesky(your_other_tensor)

## THIS IS THE SPLIT POINT

# Second half of your model starts here
solved_system = tf.placeholder(...)  # you'll feed this in with feed_dict
final_answer = do_something_with(saved_tensor1, saved_tensor2, solved_system)
Then to run the whole thing, do:
sess = tf.Session()
sess.run(tf.global_variables_initializer())
_, _, cho = sess.run([do_save_tensor1, do_save_tensor2, your_cholesky])
solution = ...  # solve your linear system with scipy, using cho
feed_dict = {solved_system: solution}
answer = sess.run(final_answer, feed_dict=feed_dict)
The key here is stashing your intermediate results in tf.Variables so that you can resume the computation afterwards.
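To see why the stash works, here's a minimal standalone sketch (with a made-up variable v) showing that a tf.Variable keeps its assigned value across separate sess.run calls within the same session:

import tensorflow as tf

v = tf.Variable(0.0)
bump = v.assign(v + 1.0)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
sess.run(bump)      # first run: assigns 1.0 to v
print(sess.run(v))  # a later run: prints 1.0, the value persisted

Because the second sess.run sees the values already sitting in the variables, the first half of the graph doesn't get recomputed when you run final_answer.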
(I'm not promising that what you get out of tf.cholesky is in the right format to feed directly to scipy, or that you wouldn't be better off just pulling out the matrix at an earlier step and handing that to scipy, but this overall workflow should work for you.)
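If you do hand the Cholesky factor to scipy, note that tf.cholesky returns the lower-triangular factor, which matches scipy.linalg.cho_solve's (factor, lower) convention. A sketch with made-up stand-ins for the matrix and right-hand side:

import numpy as np
from scipy.linalg import cholesky, cho_solve

# Stand-ins: in your code, cho would come from the sess.run call above
A = np.array([[4.0, 2.0], [2.0, 3.0]])  # symmetric positive-definite
b = np.array([1.0, 2.0])
cho = cholesky(A, lower=True)           # lower-triangular, like tf.cholesky

x = cho_solve((cho, True), b)           # solves A x = b using the factor
print(np.allclose(A.dot(x), b))         # True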
Note that this can create a performance bottleneck if you're running heavy multicore or GPU operations and then have to serialize to hand the matrix off to scipy, but it might also be just fine; it depends a lot on your setting.