
In a general tensorflow setup like

model = construct_model()
with tf.Session() as sess:
    train_model(sess)

where construct_model() contains the model definition, including random initialization of the weights (tf.truncated_normal), and train_model(sess) executes the training of the model:

Which seeds do I have to set, and where, to ensure 100% reproducibility between repeated runs of the code snippet above? The documentation for tf.random.set_random_seed is concise, but it left me a bit confused. I tried:

tf.set_random_seed(1234)
model = construct_model()
with tf.Session() as sess:
    train_model(sess)

But got different results each time.

Carlos Mermingas
Oblomov
  • You also need to remove parallelism from your computation, because that is often non-deterministic: turn off the GPU and use `sess = tf.Session(config=tf.ConfigProto(inter_op_parallelism_threads=1, intra_op_parallelism_threads=1))` – Yaroslav Bulatov Feb 03 '17 at 15:58
  • Also, some non-determinism is caused by modern instruction sets like SSE (see [here](http://blog.nag.com/2011/02/wandering-precision.html)), so to get 100% reproducibility you may need to recompile TF without SSE – Yaroslav Bulatov Feb 03 '17 at 19:31
  • Just for clarification, the above `sess = tf.Session...` in the comments does not turn off the GPU, as observed by `watch nvidia-smi` (in the case of an nvidia gpu, as on AWS EC2 p2.xlarge instances) – Shadi Sep 08 '17 at 04:08
  • [How to get reproducible results in Keras](https://stackoverflow.com/questions/32419510/how-to-get-reproducible-results-in-keras?noredirect=1&lq=1) might be useful. – Dr Nisha Arora Sep 15 '19 at 02:23
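Putting the first comment's suggestion together with a graph-level seed, a single-threaded TF 1.x session could look like the following configuration sketch (it assumes the question's construct_model/train_model functions and the 1.x tf.Session API):

```python
import tensorflow as tf

# graph-level seed; set before any ops are added to the graph
tf.set_random_seed(1234)
model = construct_model()  # from the question; must not draw from unseeded numpy.random

# one thread for both op scheduling and intra-op parallelism,
# removing thread-ordering nondeterminism
config = tf.ConfigProto(inter_op_parallelism_threads=1,
                        intra_op_parallelism_threads=1)
with tf.Session(config=config) as sess:
    train_model(sess)
```

Note that, as the later comment points out, this configuration alone does not disable the GPU, and GPU kernels can still be non-deterministic.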

4 Answers


One possible reason is that the model-construction code uses the numpy.random module, whose random state is independent of TensorFlow's. If so, set the seed for numpy as well (np.random.seed).
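The principle is easy to demonstrate with Python's built-in random module (np.random.seed behaves analogously): reseeding before each "model construction" makes the draws, and hence the initial weights, identical across runs.

```python
import random

def init_weights(n):
    # stand-in for a weight initializer that samples from a PRNG
    return [random.gauss(0.0, 1.0) for _ in range(n)]

random.seed(1234)
first = init_weights(3)

random.seed(1234)          # reseed before the second "construction"
second = init_weights(3)

assert first == second     # identical initial weights on both runs
```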

Jiren Jin

As of today, the best solution that works with a GPU is to install tensorflow-determinism:

pip install tensorflow-determinism

Then add the following lines to your code:

import tensorflow as tf
import os
os.environ['TF_DETERMINISTIC_OPS'] = '1'

source: https://github.com/NVIDIA/tensorflow-determinism

desertnaut
eugen

What has worked for me is following this answer with a few modifications:

import os
import random

import numpy as np
import tensorflow as tf

# Setting seed value
# from https://stackoverflow.com/a/52897216
# generated randomly by running `random.randint(0, 100)` once
SEED = 75
# 1. Set the `PYTHONHASHSEED` environment variable at a fixed value
os.environ['PYTHONHASHSEED'] = str(SEED)
# 2. Set the `python` built-in pseudo-random generator at a fixed value
random.seed(SEED)
# 3. Set the `numpy` pseudo-random generator at a fixed value
np.random.seed(SEED)
# 4. Set the `tensorflow` pseudo-random generator at a fixed value
tf.random.set_seed(SEED)

I was not able to figure out how to set the session seed (step 5), but it didn't seem like it was necessary.

I am running Google Colab Pro on a high-RAM TPU, and my training results (the graph of the loss function) have been exactly the same three times in a row with this method.
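Step 1 (PYTHONHASHSEED) can be verified with the standard library alone: with a fixed hash seed, str hashes are stable across separate interpreter runs, which is what hash randomization would otherwise break. This is just an illustrative sketch using a throwaway child process, not part of a training script:

```python
import os
import subprocess
import sys

SNIPPET = "print(hash('reproducibility'))"

def str_hash(seed):
    # run the snippet in a fresh interpreter with a fixed hash seed
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.run([sys.executable, "-c", SNIPPET],
                         capture_output=True, text=True, env=env)
    return out.stdout.strip()

# same seed -> same hash, even across separate interpreter processes
assert str_hash(75) == str_hash(75)
```

Note that PYTHONHASHSEED must be set before the interpreter starts to affect the current process, which is why some answers recommend exporting it in the shell instead.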

Pro Q
import os
import random

import numpy as np
import tensorflow as tf
from tensorflow import keras

SEED = 42

os.environ["TF_DETERMINISTIC_OPS"] = "1"
os.environ["PYTHONHASHSEED"] = str(SEED)
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)
keras.utils.set_random_seed(SEED)
Vijay Mariappan