
First of all, I am aware that a related question has been asked here.

However, this question is about the implementation and internals. I was reading the paper "A Tour of TensorFlow". The following two points are quoted from there:

1.

A tensor itself does not hold or store values in memory, but provides only an interface for retrieving the value referenced by the tensor.

This suggests to me that a Tensor is an object that simply stores the pointer to a result of an operation and, on retrieving the result or value of the tensor, it simply dereferences that pointer.

2.

Variables can be described as persistent, mutable handles to in-memory buffers storing tensors. As such, variables are characterized by a certain shape and a fixed type.

Here I get confused, because based on the previous point I thought that tensors simply store a pointer. If they were simply pointers, they could be mutable as well.

To be precise these are my questions:

  1. What is the meaning of "in-memory buffers"?
  2. What is the meaning of a "handle"?
  3. Is my initial assumption about the internals of a tensor correct?
  4. What is the essential internal implementation difference between a tensor and a variable? Why are they declared differently and why is that difference essential to TensorFlow?
Ujjwal

1 Answer


Before explaining the distinction between tensors and variables, we should be precise about what the word "tensor" means in the context of TensorFlow:

  • In the Python API, a tf.Tensor object represents the symbolic result of a TensorFlow operation. For example, in the expression t = tf.matmul(x, y), t is a tf.Tensor object representing the result of multiplying x and y (which may themselves be symbolic results of other operations, concrete values such as NumPy arrays, or variables).

    In this context, a "symbolic result" is more complicated than a pointer to the result of an operation. It is more analogous to a function object that, when called (i.e. passed to tf.Session.run()), will run the necessary computation to produce the result of that operation, and return it to you as a concrete value (e.g. a NumPy array).

  • In the C++ API, a tensorflow::Tensor object represents the concrete value of a multi-dimensional array. For example, the MatMul kernel takes two two-dimensional tensorflow::Tensor objects as inputs, and produces a single two-dimensional tensorflow::Tensor object as its output.
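A minimal sketch of the symbolic-vs-concrete distinction in the Python API. This uses the TF 1.x graph-mode style the answer describes, accessed through tf.compat.v1 so it also runs on a TensorFlow 2.x install; the specific constant values are illustrative, not from the answer:

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # use the graph-mode semantics described above

x = tf.constant([[1.0, 2.0]])        # symbolic tf.Tensor (no computation yet)
y = tf.constant([[3.0], [4.0]])
t = tf.matmul(x, y)                  # t merely *describes* the matmul result

with tf.Session() as sess:
    result = sess.run(t)             # only now is the computation executed

print(type(result))                  # a concrete value: numpy.ndarray
print(result)                        # [[11.]]  (1*3 + 2*4)
```

Note that printing t itself shows only shape and dtype metadata, which is exactly the "symbolic result" behavior the answer describes.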

This distinction is a little confusing, and we might choose different names if we started over (in other language APIs, we prefer the name Output for a symbolic result and Tensor for a concrete value).

A similar distinction exists for variables. In the Python API, a tf.Variable is the symbolic representation of a variable, which has methods for creating operations that read the current value of the variable, and assign values to it. In the C++ implementation, a tensorflow::Var object is a wrapper around a shared, mutable tensorflow::Tensor object.
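The read/assign behavior of a tf.Variable can be sketched as follows (again in TF 1.x graph style via tf.compat.v1; the particular values are illustrative):

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

v = tf.Variable(0.0)                  # symbolic handle to a mutable buffer
assign_op = v.assign_add(1.0)         # an op that writes to that buffer

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(v))                # 0.0 — reads the current value
    sess.run(assign_op)               # mutates the underlying buffer
    print(sess.run(v))                # 1.0 — same symbolic node, new value
```

The same symbolic node yields different values across runs, because only the in-memory buffer it wraps has changed; a plain tf.Tensor has no analogous assign operation.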

With that context out of the way, we can address your specific questions:

  1. What is the meaning of "in-memory buffers"?

    An in-memory buffer is simply a contiguous region of memory that has been allocated with a TensorFlow allocator. tensorflow::Tensor objects contain a pointer to an in-memory buffer, which holds the values of that tensor. The buffer could be in host memory (i.e. accessible from the CPU) or device memory (e.g. accessible only from a GPU), and TensorFlow has operations to move data between these memory spaces.

  2. What is the meaning of a "handle"?

    In the explanation in the paper, the word "handle" is used in a couple of different ways, which are slightly different from how TensorFlow uses the term. The paper uses "symbolic handle" to refer to a tf.Tensor object, and "persistent, mutable handle" to refer to a tf.Variable object. The TensorFlow codebase uses "handle" to refer to a name for a stateful object (like a tf.FIFOQueue or tf.TensorArray) that can be passed around without copying all of the values (i.e. call-by-reference).

  3. Is my initial assumption about the internals of a tensor correct?

    Your assumption most closely matches the definition of a (C++) tensorflow::Tensor object. The (Python) tf.Tensor object is more complicated because it refers to a function for computing a value, rather than the value itself.

  4. What is the essential internal implementation difference between a tensor and a variable?

    In C++, a tensorflow::Tensor and a tensorflow::Var are very similar; the only difference is that tensorflow::Var also has a mutex that can be used to lock the variable while it is being updated.

    In Python, the essential difference is that a tf.Tensor is a node in a dataflow graph, and it is read-only (i.e. its value can only be retrieved, by calling tf.Session.run()). A tf.Variable can be both read (i.e. by evaluating its read operation) and written (e.g. by running an assign operation).

    Why are they declared differently and why is that difference essential to TensorFlow?

    Tensors and variables serve different purposes. Tensors (tf.Tensor objects) can represent complex compositions of mathematical expressions, like loss functions in a neural network, or symbolic gradients. Variables represent state that is updated over time, like weight matrices and convolutional filters during training. While in principle you could represent the evolving state of a model without variables, you would end up with a very large (and repetitive) mathematical expression, so variables provide a convenient way to materialize the state of the model, and—for example—share it with other machines for parallel training.
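The division of labor described in point 4 can be sketched in a toy training step (TF 1.x graph style via tf.compat.v1; the loss, learning rate, and initial value are illustrative assumptions, not from the answer):

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

w = tf.Variable(5.0)                   # mutable state: the model parameter
loss = tf.square(w)                    # tf.Tensor: a fixed symbolic expression
grad = tf.gradients(loss, [w])[0]      # symbolic gradient, also a tf.Tensor
train_step = w.assign_sub(0.1 * grad)  # only the variable can be written

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(3):
        sess.run(train_step)           # the graph never grows; only w's
                                       # in-memory buffer changes
    final_w = sess.run(w)              # 5.0 * 0.8**3 = 2.56
```

The graph (loss, grad, train_step) is built once and stays fixed; repeated calls to sess.run() mutate only the variable's buffer. Without variables, each update would have to be expressed as an ever-growing symbolic expression, which is the "very large (and repetitive)" expression the answer warns about.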

mrry
  • It was a very informative explanation, and thanks for it. If I understood correctly, the read-only nature of tensors is acceptable because they represent intermediate operations that are not of interest with respect to the final state of the model. Is that the same reason why we store convolution kernels as tf.Variable, so that they can be written during updates and saved later on? – Ujjwal Nov 29 '16 at 17:00