
I'm new to TensorFlow 2 and reading the docs: https://www.tensorflow.org/api_docs/python/tf/Module

On this page, the part related to my question is the MLP example (copy-pasted from there):

import tensorflow as tf

class MLP(tf.Module):
  def __init__(self, input_size, sizes, name=None):
    super(MLP, self).__init__(name=name)
    self.layers = []
    with self.name_scope:
      for size in sizes:
        self.layers.append(Dense(input_dim=input_size, output_size=size))
        input_size = size

  @tf.Module.with_name_scope
  def __call__(self, x):
    for layer in self.layers:
      x = layer(x)
    return x
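
(For reference, Dense here is the small tf.Module defined just above MLP on the same docs page; it is reproduced below so the snippet is self-contained.)

class Dense(tf.Module):
  def __init__(self, input_dim, output_size, name=None):
    super().__init__(name=name)
    self.w = tf.Variable(
        tf.random.normal([input_dim, output_size]), name='w')
    self.b = tf.Variable(tf.zeros([output_size]), name='b')

  def __call__(self, x):
    y = tf.matmul(x, self.w) + self.b
    return tf.nn.relu(y)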

What I don't understand is the output of the following:

>>> module = MLP(input_size=5, sizes=[5, 5])
>>> module.variables
(<tf.Variable 'mlp/b:0' shape=(5,) ...>,
 <tf.Variable 'mlp/w:0' shape=(5, 5) ...>,
 <tf.Variable 'mlp/b:0' shape=(5,) ...>,
 <tf.Variable 'mlp/w:0' shape=(5, 5) ...>)

I expected mlp/b:1 and mlp/w:1 to appear. I also ran the same code on my machine and got the same names, i.e. both mlp/b:0 and mlp/w:0 appear twice. Can anyone point out what I have missed? Does this result mean that the same w and b are reused?

NeoZoom.lua

1 Answer


From the docs,

A tf.Variable represents a tensor whose value can be changed by running ops on it. Specific ops allow you to read and modify the values of this tensor. Higher level libraries like tf.keras use tf.Variable to store model parameters.
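
As a quick illustration of the "read and modify" ops the quote mentions (a minimal sketch):

import tensorflow as tf

v = tf.Variable(1.0)
v.assign_add(2.0)       # an op that modifies the variable's stored value
print(v.read_value())   # tf.Tensor(3.0, shape=(), dtype=float32) -- an op that reads it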

The :0 is not the layer number in any way. It is the index of the output tensor of the op that produced it, a notation from the underlying graph API.

For example, a tf.Variable op produces a single output tensor (:0), whereas a 3-way split via tf.split produces three output tensors (:0, :1, :2) from one op in the computational graph:

tf.Variable([1])
# output:
# <tf.Variable 'Variable:0' shape=(1,) dtype=int32, numpy=array([1], dtype=int32)>

and

import tensorflow as tf

tf.compat.v1.disable_eager_execution()
a, b, c = tf.split([1, 1, 1], 3)
print(a.name)   # split:0
print(b.name)   # split:1
print(c.name)   # split:2
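
Coming back to the question: the two Dense layers do not share storage, even though their variables print with identical names. Here is a quick check, a sketch reusing the MLP and Dense classes from the question (run it in a fresh eager session, i.e. without the disable_eager_execution() call above; it assumes Dense initializes b to zeros, as in the docs example):

module = MLP(input_size=5, sizes=[5, 5])
b0, w0, b1, w1 = module.variables   # order as printed in the question

print(b0.name == b1.name)   # True  -- both print as 'mlp/b:0'
print(b0 is b1)             # False -- two distinct tf.Variable objects
b0.assign(tf.ones(5))       # mutate the first layer's bias...
print(b1.numpy())           # ...the second layer's bias is still all zeros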

Refer to this post

Shivam Miglani
  • So TensorFlow reuses the same space behind the scenes, even though I seem to have created 4 variables (2 per Dense)? – NeoZoom.lua Jul 19 '22 at 01:51
  • 1
    It is not the same memory. `op:number` is just notation in underlying API to denote that one or two or three or n memory allocations (tensors) are the output of an op. In our example, In case of tf.Variable op, one tensor is required, therefore it will always be :0 In case of 3-way split op, three tensors will be output, therefore, it will always be :0,:1,:2 5-way split will need 5 tensors as output ... and so on. – Shivam Miglani Jul 19 '22 at 08:28
  • If you have time, please help me: https://stackoverflow.com/q/73033488/5290519 – NeoZoom.lua Jul 19 '22 at 08:50
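
The sketch referenced in the comment above: in graph mode, all three tensors returned by tf.split come from a single op, and the suffix after the colon is just the index into that op's outputs list (this uses the standard graph-mode attributes `tensor.op` and `op.outputs`):

import tensorflow as tf

tf.compat.v1.disable_eager_execution()

a, b, c = tf.split([1, 1, 1], 3)

print(a.op.name)                        # split
print(a.op is b.op is c.op)             # True -- one op, three output tensors
print([t.name for t in a.op.outputs])   # ['split:0', 'split:1', 'split:2']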