What caching model does TensorFlow use?

Question

I read the question "TensorFlow - get current value of a Variable", and the answers there have left me confused.

On one hand, dga says "And to be very clear: Running the variable will produce only the current value of the variable; it will not run any assign operations associated with it. It's cheap."

On the other hand, Salvador Dali says "@dga yes, if the variable depends on n other variables, they also need to be evaluated."

So, which is it? Does evaluating the variable only return its current value, or does it recompute its value from scratch from the variables it depends on?

What happens if I evaluate the same variable twice in a row? Does TensorFlow have any notion of "stale" variables, i.e. variables that need to be recomputed because their dependencies have changed (as in a build system)?

I ask because I work with multiple nets where the partial output of one net becomes the partial input of another net. I want to fetch the gradients computed at the input layer of one net and merge+apply them to the output layer of another net. I was hoping to do this by manually retrieving/storing gradients in the variables of a graph, and then running graph operations to backpropagate the gradients. Thus I need to understand how it all works under the hood.

What I do is similar to the question "How to use Tensorflow Optimizer without recomputing activations in reinforcement learning program that returns control after each iteration?", but I can't tell from its last answer whether this is possible (is the experimental support in now?).

Thanks!

score 2 · Accepted Answer · answered Sep 07 '16 at 23:22

@dga is correct. If you pass a tf.Variable object to tf.Session.run() TensorFlow will return the current value of the variable, and it will not perform any computation. It is cheap (the cost of a memory copy, or possibly a network transfer in the case of a distributed TensorFlow setup). TensorFlow does not retain any history* about how the value of a tf.Variable was updated, so it cannot in general recompute its value from scratch.
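To make this concrete, here is a minimal sketch, assuming the TF 1.x graph/session API (reached through `tensorflow.compat.v1` so that it also runs on TF 2.x). The variable names `a` and `b` and the ops `update_b` and `bump_a` are made up for illustration. Fetching `b` never triggers the assign op that writes to it, and `b` does not become "stale" when `a` changes; it simply keeps the last value that was assigned.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # graph mode, as in the TF 1.x API discussed here

graph = tf.Graph()
with graph.as_default():
    a = tf.Variable(1.0, name="a")
    b = tf.Variable(0.0, name="b")
    # Ops that mutate the variables; they only execute when fetched explicitly.
    update_b = tf.assign(b, a * 2.0)
    bump_a = tf.assign_add(a, 1.0)
    init = tf.global_variables_initializer()

with tf.Session(graph=graph) as sess:
    sess.run(init)
    b_initial = sess.run(b)        # reads stored value; update_b is not run
    sess.run(update_b)             # b <- a * 2
    b_after_update = sess.run(b)
    sess.run(bump_a)               # a changes, b is left untouched
    b_after_bump = sess.run(b)     # same value as before, no recomputation
    sess.run(update_b)             # b changes only when the assign is run again
    b_refreshed = sess.run(b)

print(b_initial, b_after_update, b_after_bump, b_refreshed)
```

So evaluating the same variable twice in a row gives the same value unless some assign op ran in between. If you want a value that follows its inputs, fetch the tensor (here `a * 2.0`) rather than a variable it was once stored in.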

(* Technically, TensorFlow remembers the tf.Tensor that was used to initialize each variable, so it is possible to recompute the initial value of the variable.)
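A small sketch of that footnote, under the same assumption of the TF 1.x graph API via `tensorflow.compat.v1`; the variable `v` and the op `overwrite` are hypothetical names. The `initial_value` tensor and the `initializer` op stay attached to the variable, so the initial value can be read or restored even after the variable has been overwritten.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # graph mode, as in the TF 1.x API discussed here

graph = tf.Graph()
with graph.as_default():
    v = tf.Variable(3.0, name="v")
    overwrite = tf.assign(v, 10.0)  # hypothetical update to the variable

with tf.Session(graph=graph) as sess:
    sess.run(v.initializer)
    sess.run(overwrite)
    current = sess.run(v)                  # value after the overwrite
    recomputed = sess.run(v.initial_value) # evaluates the initializing tensor
    still_current = sess.run(v)            # reading initial_value changed nothing
    sess.run(v.initializer)                # re-running the initializer resets v
    reset = sess.run(v)

print(current, recomputed, still_current, reset)
```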

OK, everything is cached. Thank you. – Sep 08 '16 at 15:33
