
In a general tensorflow setup like

model = construct_model()
with tf.Session() as sess:
    train_model(sess)

where construct_model() contains the model definition, including random initialization of the weights (tf.truncated_normal), and train_model(sess) executes the training of the model:

Which seeds do I have to set, and where, to ensure 100% reproducibility between repeated runs of the code snippet above? The documentation for tf.random.set_random_seed is concise, but it left me a bit confused. I tried:

tf.set_random_seed(1234)
model = construct_model()
with tf.Session() as sess:
    train_model(sess)

But got different results each time.

Carlos Mermingas
Oblomov
  • You also need to remove parallelism from your computation, because that is often non-deterministic: turn off the GPU and use `sess = tf.Session(config=tf.ConfigProto(inter_op_parallelism_threads=1, intra_op_parallelism_threads=1))` – Yaroslav Bulatov Feb 03 '17 at 15:58
  • Also, some non-determinism is caused by modern instruction sets like SSE (see [here](http://blog.nag.com/2011/02/wandering-precision.html)), so to get 100% reproducibility you may need to recompile TF without SSE – Yaroslav Bulatov Feb 03 '17 at 19:31
  • Just for clarification, the above `sess = tf.Session...` in the comments does not turn off the GPU, as observed by `watch nvidia-smi` (in the case of an nvidia gpu, as on AWS EC2 p2.xlarge instances) – Shadi Sep 08 '17 at 04:08
  • [How to get reproducible results in Keras](https://stackoverflow.com/questions/32419510/how-to-get-reproducible-results-in-keras?noredirect=1&lq=1) might be useful. – Dr Nisha Arora Sep 15 '19 at 02:23
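Putting the first comment's suggestion together with a graph-level seed, a single-threaded TF 1.x session could look like the following configuration sketch (it assumes the question's construct_model/train_model functions and the 1.x tf.Session API):

```python
import tensorflow as tf

# graph-level seed; set before any ops are added to the graph
tf.set_random_seed(1234)
model = construct_model()  # from the question; must not draw from unseeded numpy.random

# one thread for both op scheduling and intra-op parallelism,
# removing thread-ordering nondeterminism
config = tf.ConfigProto(inter_op_parallelism_threads=1,
                        intra_op_parallelism_threads=1)
with tf.Session(config=config) as sess:
    train_model(sess)
```

Note that, as the later comment points out, this configuration alone does not disable the GPU, and GPU kernels can still be non-deterministic.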

4 Answers


One possible reason is that the model-construction code uses the numpy.random module, whose random state is independent of TensorFlow's. If so, set the seed for numpy as well (np.random.seed).
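The principle is easy to demonstrate with Python's built-in random module (np.random.seed behaves analogously): reseeding before each "model construction" makes the draws, and hence the initial weights, identical across runs.

```python
import random

def init_weights(n):
    # stand-in for a weight initializer that samples from a PRNG
    return [random.gauss(0.0, 1.0) for _ in range(n)]

random.seed(1234)
first = init_weights(3)

random.seed(1234)          # reseed before the second "construction"
second = init_weights(3)

assert first == second     # identical initial weights on both runs
```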

Jiren Jin

As of today, the best solution that works with a GPU is to install tensorflow-determinism:

pip install tensorflow-determinism

Then add the following lines to your code:

import tensorflow as tf
import os
os.environ['TF_DETERMINISTIC_OPS'] = '1'

source: https://github.com/NVIDIA/tensorflow-determinism

desertnaut
eugen

What has worked for me is following this answer with a few modifications:

import os
import random

import numpy as np
import tensorflow as tf

# Setting seed value
# from https://stackoverflow.com/a/52897216
# generated randomly by running `random.randint(0, 100)` once
SEED = 75
# 1. Set the `PYTHONHASHSEED` environment variable at a fixed value
os.environ['PYTHONHASHSEED'] = str(SEED)
# 2. Set the `python` built-in pseudo-random generator at a fixed value
random.seed(SEED)
# 3. Set the `numpy` pseudo-random generator at a fixed value
np.random.seed(SEED)
# 4. Set the `tensorflow` pseudo-random generator at a fixed value
tf.random.set_seed(SEED)

I was not able to figure out how to set the session seed (step 5), but it didn't seem like it was necessary.

I am running Google Colab Pro on a high-RAM TPU, and my training results (the graph of the loss function) have been exactly the same three times in a row with this method.
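Step 1 (PYTHONHASHSEED) can be verified with the standard library alone: with a fixed hash seed, str hashes are stable across separate interpreter runs, which is what hash randomization would otherwise break. This is just an illustrative sketch using a throwaway child process, not part of a training script:

```python
import os
import subprocess
import sys

SNIPPET = "print(hash('reproducibility'))"

def str_hash(seed):
    # run the snippet in a fresh interpreter with a fixed hash seed
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.run([sys.executable, "-c", SNIPPET],
                         capture_output=True, text=True, env=env)
    return out.stdout.strip()

# same seed -> same hash, even across separate interpreter processes
assert str_hash(75) == str_hash(75)
```

Note that PYTHONHASHSEED must be set before the interpreter starts to affect the current process, which is why some answers recommend exporting it in the shell instead.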

Pro Q
import os
import random

import numpy as np
import tensorflow as tf
from tensorflow import keras

SEED = 42

os.environ["TF_DETERMINISTIC_OPS"] = "1"
os.environ["PYTHONHASHSEED"] = str(SEED)
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)
keras.utils.set_random_seed(SEED)
Vijay Mariappan