12

I am using Keras with the TensorFlow backend to build a deep learning LSTM model. Each time I run the model, the result is different. Is there a way to fix the seed so that the results are reproducible? Thank you!

Edamame
    Please have a look at my response here (https://stackoverflow.com/a/52897216/9024698) for when using the CPU. – Outcast Oct 22 '18 at 09:55

3 Answers

10

As @Poete_Maudit said here: How to get reproducible results in keras

To get reproducible results, do the following at the very beginning of your script (note that this restricts TensorFlow to a single CPU thread):

# Seed value (can actually be different for each attribution step)
seed_value = 0

# 1. Set `PYTHONHASHSEED` environment variable at a fixed value
import os
os.environ['PYTHONHASHSEED'] = str(seed_value)

# 2. Set `python` built-in pseudo-random generator at a fixed value
import random
random.seed(seed_value)

# 3. Set `numpy` pseudo-random generator at a fixed value
import numpy as np
np.random.seed(seed_value)

# 4. Set `tensorflow` pseudo-random generator at a fixed value
import tensorflow as tf
tf.random.set_seed(seed_value) # tensorflow 2.x
# tf.set_random_seed(seed_value) # tensorflow 1.x

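# 5. Configure a new global `tensorflow` session limited to one thread
# The names below are the tensorflow 1.x API. In tensorflow 2.x they are
# available as tf.compat.v1.ConfigProto, tf.compat.v1.Session and
# tf.compat.v1.get_default_graph.
from keras import backend as K
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)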

Note: You can no longer get reproducible results with the command PYTHONHASHSEED=0 python3 script.py, as https://keras.io/getting-started/faq/#how-can-i-obtain-reproducible-results-using-keras-during-development might lead you to believe. You have to set PYTHONHASHSEED with os.environ within your script, as in step #1. Also, this does NOT work when running on a GPU.
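The seeding steps above can also be collected into one helper. The following is only a minimal sketch: the function name `set_global_seed` is an assumption made for illustration, and it seeds numpy and tensorflow only if they can be imported, choosing between the 2.x and 1.x seed calls quoted above.

```python
import os
import random


def set_global_seed(seed_value=0):
    """Seed the random number generators that are available in this environment."""
    # Step 1: export PYTHONHASHSEED, mirroring the snippet above.
    os.environ["PYTHONHASHSEED"] = str(seed_value)

    # Step 2: Python's built-in pseudo-random generator.
    random.seed(seed_value)

    # Step 3: numpy, if it is installed.
    try:
        import numpy as np
        np.random.seed(seed_value)
    except ImportError:
        pass

    # Step 4: tensorflow, if it is installed.
    try:
        import tensorflow as tf
    except ImportError:
        tf = None
    if tf is not None:
        if hasattr(tf.random, "set_seed"):
            tf.random.set_seed(seed_value)   # tensorflow 2.x
        else:
            tf.set_random_seed(seed_value)   # tensorflow 1.x


# Call once, before building the model.
set_global_seed(0)
```

Calling the helper before any model code keeps the seeding in one place; the session configuration from step 5 would still have to be done separately.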

  • This code snippet is not working for me. I'm using Keras API (2.2.4-tf) from TensorFlow (version 1.14.0) for deep learning in Google Co-lab. Please suggest – Dr Nisha Arora Sep 15 '19 at 02:53
  • 1
    For Tensorflow 2.0 `tf.random.set_random_seed(seed_value)` changed to `tf.random.set_seed(seed_value)` – Giora Simchoni Jun 06 '20 at 15:43
3

There is an inherent randomness in deep learning that leads to non-reproducible results, but you can control it to a certain extent.

With a deep neural network, several sources of randomness can affect reproducibility and lead to different results, such as

  • Randomness in Initialization, such as weights.

  • Randomness in Regularization, such as dropout.

  • Randomness in Layers.

  • Randomness in Optimization.

There are several ways to mitigate this. One option is to report summary statistics over several runs. Another option, which gives more reproducible results, is to set a fixed random seed for numpy and/or tensorflow; see:

https://docs.scipy.org/doc/numpy-1.12.0/reference/generated/numpy.random.seed.html

https://www.tensorflow.org/api_docs/python/tf/set_random_seed
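The summary-statistics option mentioned above could look roughly like the sketch below. The function `train_and_evaluate` is a hypothetical stand-in that returns a made-up score; in practice it would build, train and evaluate the real model with the given seed.

```python
import random
import statistics


def train_and_evaluate(seed):
    """Hypothetical stand-in for one training run; returns a fake validation score."""
    rng = random.Random(seed)
    return 0.90 + rng.uniform(-0.02, 0.02)


def summarize_runs(seeds):
    """Run one training per seed and return the mean and standard deviation of the scores."""
    scores = [train_and_evaluate(s) for s in seeds]
    return statistics.mean(scores), statistics.stdev(scores)


mean_score, std_score = summarize_runs(range(10))
print(f"score: {mean_score:.3f} +/- {std_score:.3f}")
```

Reporting the mean and spread over several seeds does not remove the randomness, but it makes results comparable between experiments.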

For operations that run on the GPU, we can request a deterministic implementation instead of the default non-deterministic one. For NVIDIA graphics cards see: docs.nvidia.com/cuda
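As a hedged sketch of how deterministic GPU operations might be requested from TensorFlow: the helper name `request_deterministic_ops` is an assumption made for illustration. It sets the `TF_DETERMINISTIC_OPS` environment variable, which some TensorFlow releases read, and calls `tf.config.experimental.enable_op_determinism()` only if the installed version provides it. Whether every operation in a given model has a deterministic implementation depends on the TensorFlow and CUDA versions in use.

```python
import os


def request_deterministic_ops():
    """Ask TensorFlow for deterministic ops; return True if the 2.x API call was made."""
    # Environment variable read by some TensorFlow releases; harmless if ignored.
    os.environ["TF_DETERMINISTIC_OPS"] = "1"

    try:
        import tensorflow as tf
    except ImportError:
        return False

    # Look the function up defensively, since older releases do not have it.
    config = getattr(tf, "config", None)
    experimental = getattr(config, "experimental", None)
    enable = getattr(experimental, "enable_op_determinism", None)
    if enable is None:
        return False

    enable()
    return True


used_api = request_deterministic_ops()
```

Deterministic kernels are often slower than the default ones, so this is usually worth enabling only while debugging or comparing experiments.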

codeslord
0

Basically, the key to making the results reproducible is to disable the GPU. This is very important. To do this, just include

import os
import tensorflow as tf
import numpy as np
import random as rn

# Hide all GPUs from TensorFlow so that everything runs on the CPU
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = ""

sd = 1 # Here sd means seed.
np.random.seed(sd)
rn.seed(sd)
os.environ['PYTHONHASHSEED'] = str(sd)

# Single-threaded session (tensorflow 1.x API)
from keras import backend as K
config = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
tf.set_random_seed(sd)
sess = tf.Session(graph=tf.get_default_graph(), config=config)
K.set_session(sess)

at the very beginning of your code. Hope this helps.
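One way to check whether a setup like the one above has the intended effect is to run the same computation several times and compare the outputs. The sketch below uses only the standard library; `run_once` is a hypothetical stand-in for building and training the model, and `is_reproducible` is an illustrative helper name.

```python
import random


def run_once(seed):
    """Hypothetical stand-in for one training run; returns fake 'weights'."""
    random.seed(seed)
    return [random.gauss(0.0, 1.0) for _ in range(5)]


def is_reproducible(run, seed, repeats=3):
    """Return True if `run(seed)` gives identical output on every repeat."""
    reference = run(seed)
    return all(run(seed) == reference for _ in range(repeats - 1))


print(is_reproducible(run_once, seed=1))
```

With a real model, `run_once` would return something comparable, such as the final loss or the predictions on a fixed batch.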

guorui