By setting the random seed with tf.set_random_seed(1234), I can repeat training runs exactly; so far so good. However, I noticed slight deviations when introducing extra tensors into the graph. In the following example, versions B and C yield exactly the same losses, but A gives something slightly (though not altogether) different. It's important to note that in version C, intermediate_tensor is not attached to anything.

# version A:
output_tensor = input_tensor

# version B:
intermediate_tensor = input_tensor[..., :]
output_tensor = intermediate_tensor

# version C:
intermediate_tensor = input_tensor[..., :]
output_tensor = input_tensor
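
For reference, here is a minimal, runnable sketch of how I compare the versions; the placeholder input and the dense layer are only stand-ins for my actual model:

import tensorflow as tf

tf.set_random_seed(1234)
input_tensor = tf.placeholder(tf.float32, [None, 8])

# version A (swap in version B or C from above here):
output_tensor = input_tensor

# a layer with randomly initialised weights; its initializer seed
# apparently depends on the ops created before it
logits = tf.layers.dense(output_tensor, 2)
loss = tf.reduce_mean(tf.square(logits))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(loss, feed_dict={input_tensor: [[1.0] * 8]}))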

I would appreciate any insights, as I cannot explain this behaviour. Is it possible that the random number generator is somehow influenced by the graph content?

Lisa

1 Answer

Operations that rely on a random seed actually derive it from two seeds: the graph-level and the operation-level seed. tf.set_random_seed sets the graph-level seed.

If the graph-level seed is set, but the operation seed is not: The system deterministically picks an operation seed in conjunction with the graph-level seed so that it gets a unique random sequence.

Yes, the PRNG is influenced by the graph content. A full description is here (warning: a long read!). The defaults for both seeds are None, and in that case they are initialized randomly afterwards:

seed = random::New64();
seed2 = random::New64();

And behind random::New64() there is (guess what, a surprise) a Mersenne Twister engine. For the GPU ops, the Philox algorithm is used, which is counter-based as well.

The same is also true when working on datasets.
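
For instance, Dataset.shuffle accepts its own seed argument (a minimal sketch using the TF 1.x tf.data API):

import tensorflow as tf

tf.set_random_seed(1234)

# without the explicit seed=7, the shuffle order would be derived from
# the graph-level seed and the ops created so far
dataset = tf.data.Dataset.range(10).shuffle(buffer_size=10, seed=7)
next_element = dataset.make_one_shot_iterator().get_next()

with tf.Session() as sess:
    print([sess.run(next_element) for _ in range(10)])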

To avoid all this seed-derivation magic, you should specify a per-operation random seed.
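
For example (a minimal sketch), pinning the operation-level seed makes an op's values independent of whatever else is created in the graph:

import tensorflow as tf

tf.set_random_seed(1234)

# extra ops created before `a` no longer shift its seed,
# because the operation-level seed is fixed explicitly
a = tf.random_normal([1], seed=42)

with tf.Session() as sess:
    print(sess.run(a))  # identical value even if other ops are added to the graph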

In summary, every additional random-op node in the graph changes the seeds derived for the random ops created after it. But in simple cases, where the extra ops come after the random ops, nothing shifts:

import tensorflow as tf
tf.set_random_seed(1234)

out1 = tf.random_normal([1])
out2 = tf.sqrt(tf.square(tf.random_normal([1])))
# alternative definition; swap with the line above:
# out2 = tf.random_normal([1])

with tf.Session() as sess:
  print(sess.run([out1, out2]))

The outputs are the same with either definition of out2 (out2 differs only in sign, since sqrt(square(x)) equals |x|):

[array([-0.1386252], dtype=float32), array([-1.3978306], dtype=float32)]
[array([-0.1386252], dtype=float32), array([1.3978306], dtype=float32)]

If you additionally want reproducible results across multiple runs within one session, please refer to

Reproducible results in Tensorflow with tf.set_random_seed

Patwie
  • Thanks for taking the time for this elaborate answer. In [this](https://stackoverflow.com/a/51253945/2323484) answer, I found out that even non-random operations change the seeds for the following random operations, as the operation-level seeds are determined based on the ID of the previous operation. – Lisa Aug 21 '18 at 13:42