I have read several other posts about how to get reproducible results with TensorFlow/Keras, but I am still getting results that vary from run to run.

The other random components of my script are sklearn.model_selection.train_test_split(), for which I set random_state = 123456, and an ImageDataGenerator from Keras for data augmentation, where I set the seed as well.

Below is the code, showing where I set the seeds. I run my model for 1 epoch to compare the accuracies, and they are different every time. I set the random seeds at the top of my code and then pass the random state inside the functions. If I leave out random_state in sklearn.model_selection.train_test_split or seed in keras.preprocessing.image.ImageDataGenerator, the results also differ between runs.

# seeding necessary for neural network reproducibility at top of the code
SEED = 123456
import os
import random as rn
import numpy as np
import tensorflow as tf  # required for tf.random.set_seed below

os.environ['PYTHONHASHSEED']=str(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)
rn.seed(SEED)

# within another function
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size = validation_size, random_state = 123456)

# in the model building function
data_generator = ImageDataGenerator(
    rotation_range = 15,
    width_shift_range = 0.1,
    height_shift_range = 0.1,
    horizontal_flip = True
    #zoom_range = [0.5, 1.0] # half zoom to double zoom possibilities
)
data_generator.fit(xtrain, seed = 123456)
Coldchain9

2 Answers

After some trial and error, I found a library for NVIDIA GPUs (which I am using) that gives me deterministic results.

First, install the library with pip install tensorflow-determinism. Then set the environment variable:

os.environ['TF_DETERMINISTIC_OPS'] = '1'

I am now getting reproducible results:

Test error: 1.6772334575653076, test accuracy: 0.5085999965667725
Test error: 1.6772334575653076, test accuracy: 0.5085999965667725
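
For anyone who wants the environment variable and the seeds from the question in one place, here is a minimal sketch. The names make_deterministic, run_twice and train_once are my own and hypothetical, not part of any library. The NumPy and TensorFlow imports are wrapped in try/except only so the helper still runs where one of them is missing; in a real training script you would import them directly. I am assuming it is safest to set the flag as early as possible, before any TensorFlow work happens.

```python
import os
import random

SEED = 123456  # same value as in the question


def make_deterministic(seed: int = SEED) -> None:
    """Set the determinism flag and seed every RNG that is available."""
    # Set as early as possible, before any TensorFlow ops are created.
    os.environ["TF_DETERMINISTIC_OPS"] = "1"
    # Read at interpreter start-up, so this only affects Python processes
    # launched after this point, not the one that is already running.
    os.environ["PYTHONHASHSEED"] = str(seed)

    random.seed(seed)

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed, nothing to seed

    try:
        import tensorflow as tf
        tf.random.set_seed(seed)
    except ImportError:
        pass  # TensorFlow not installed, nothing to seed


def run_twice(train_once, seed: int = SEED):
    """Re-seed, call train_once, repeat, and return both results."""
    results = []
    for _ in range(2):
        make_deterministic(seed)
        results.append(train_once())
    return results


if __name__ == "__main__":
    # Stand-in for a real training run: any callable that returns metrics.
    first, second = run_twice(lambda: random.random())
    print(first == second)  # prints True
```

In practice train_once would build the model, fit it for one epoch and return the test error and accuracy, so that the two results can be compared the same way as the two lines above.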
Coldchain9

I think your problem is how the Keras model parameters are initialized. You can specify your own initializers to get reproducible behaviour, as documented here.

Yannick Funk
  • Why wouldn't my global seeding handle that? Surely keras is using tensorflow's seed and I cover that, right? – Coldchain9 Sep 20 '20 at 14:47
  • @Coldchain9 Look at the discussion here: https://stackoverflow.com/questions/45230448/how-to-get-reproducible-result-when-running-keras-with-tensorflow-backend They covered it already. – Yannick Funk Sep 20 '20 at 14:49
  • A lot of these discussions are saying I need to disable my GPU to get reproducible results. That just isn't feasible for what I am trying to do. – Coldchain9 Sep 20 '20 at 15:05
  • Think I found a solution. I need to run `pip install tensorflow-determinism` and set `os.environ['TF_DETERMINISTIC_OPS'] = '1'` at the top of my code. Looks like I am getting the same results across training sessions now. Thank you for leading me down the right path. – Coldchain9 Sep 20 '20 at 15:13