
I have tried the various resources here on StackExchange and those provided by Keras to obtain reproducible results from my LSTM. I am currently doing the following, in this order, and still failing to get reproducible results:

  • I set PYTHONHASHSEED (via a magic; as you will see, I am working in a Jupyter notebook).

  • I ensure that I am only using CPU and not GPU.

  • I set random seeds for the Python libraries in which any randomization might be introduced (NumPy, Python's random module, TensorFlow).

  • I force TensorFlow to use a single thread.

  • Although RandomizedSearchCV and TimeSeriesSplit are imported in the code below, I am not using them to generate my results, as it occurred to me that they might introduce other randomization pitfalls. Instead, I am using a single training set and a single test set, split by hardcoded indices, and training for only one epoch to make it easier to isolate the source of this unwanted variability.

Instituting the above controls has narrowed the variability considerably, but my val_loss still alternates between two values, so it is not truly reproducible. I have no idea why one run gives me one value while the next gives me either the same value or the other one, with no apparent pattern. Can someone please tell me what I'm doing wrong?

%env PYTHONHASHSEED=0

import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import numpy as np
import tensorflow as tf
import random as rn

seed = 2

# The below is necessary for starting Numpy generated random numbers
# in a well-defined initial state.

np.random.seed(seed)

# The below is necessary for starting core Python generated random numbers
# in a well-defined state.

rn.seed(seed)

# The below tf.set_random_seed() will make random number generation
# in the TensorFlow backend have a well-defined initial state.
# For further details, see:
# https://www.tensorflow.org/api_docs/python/tf/set_random_seed

tf.set_random_seed(seed)

# Force TensorFlow to use single thread.
# Multiple threads are a potential source of non-reproducible results.
# For further details, see: https://stackoverflow.com/questions/42022950/

session_conf = tf.ConfigProto(intra_op_parallelism_threads=1,
                              inter_op_parallelism_threads=1)

from keras import backend as K

sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)

# Rest of code follows ..




import pandas as pd, matplotlib.pyplot as plt, scipy.stats
from keras.models import Sequential
from keras.layers import Dense, BatchNormalization, Embedding, LSTM, Dropout
from keras.callbacks import EarlyStopping
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import RandomizedSearchCV, TimeSeriesSplit
from keras.optimizers import Adam
from keras.losses import binary_crossentropy

model = Sequential()
model.add(LSTM(30, input_shape = (sequences.shape[1], sequences.shape[2]), activation = 'relu'))
model.add(BatchNormalization())
model.add(Dense(30, activation = 'relu'))
model.add(BatchNormalization())
model.add(Dense(25, activation = 'relu'))
model.add(BatchNormalization())
model.add(Dense(1, activation = 'sigmoid'))

#compile model
model.compile(optimizer = Adam(lr=.001), loss = binary_crossentropy, metrics=['accuracy'])

pmse234

1 Answer


I suggest checking whether your model contains nondeterministic operations. Notably, reduce_sum is one such operation. These operations are nondeterministic because floating-point addition and multiplication are nonassociative (the order in which floating-point numbers are added or multiplied affects the result) and because such operations don't guarantee their inputs are added or multiplied in the same order every time. I'm not aware of a complete list of nondeterministic TensorFlow operations. To investigate which operation may be causing this issue, try reducing the number of layers in the model or changing which layers or operations are used in that model, and see whether doing so gives you consistent results. See also this question.
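To illustrate the nonassociativity point, here is a minimal sketch in plain Python (no TensorFlow involved, so it only demonstrates the arithmetic, not any particular TensorFlow operation). The specific numbers are arbitrary illustrations chosen to make the effect visible.

```python
import math

# Same three numbers, two groupings: the results differ in the last bit.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

# With mixed magnitudes the gap becomes much larger.
big_first = (1e16 + 1.0) - 1e16   # the 1.0 is absorbed when added to 1e16
small_last = (1e16 - 1e16) + 1.0  # the 1.0 survives
print(big_first, small_last)      # 0.0 1.0

# math.fsum returns the correctly rounded sum, so it does not depend on order.
values = [1e16, 1.0, -1e16]
print(math.fsum(values), math.fsum(reversed(values)))  # 1.0 1.0
```

A reduction that adds its inputs in a different order from run to run can therefore return slightly different results even when every seed is fixed.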

Peter O.
  • All TensorFlow-related code was in the code snippet I included. I scoured it after reading your interesting post but, sadly, couldn’t find any such non-deterministic operations. – pmse234 Sep 27 '19 at 11:27
  • To be clear, I did not originally write that post, but merely edited it. See that post's edit history. – Peter O. Sep 27 '19 at 11:30
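Following the answer's suggestion to narrow the problem down, one possible approach is to fingerprint the model's weights at several points (right after building the model, after training) and compare the fingerprints between two runs. The sketch below is only an illustration: the `fingerprint` helper and the `run_a`, `run_b`, `run_c` sample data are hypothetical, and the Keras usage in the comments assumes `model.get_weights()` returns a list of NumPy arrays.

```python
import hashlib
import struct


def fingerprint(weight_lists):
    """Return a SHA-256 hex digest over every float, in order."""
    digest = hashlib.sha256()
    for values in weight_lists:
        # Record each length so [[1.0], [2.0]] and [[1.0, 2.0]] hash differently.
        digest.update(struct.pack("<q", len(values)))
        for v in values:
            digest.update(struct.pack("<d", v))
    return digest.hexdigest()


# Hypothetical usage with a Keras model (not run here):
#   before = fingerprint([w.ravel().tolist() for w in model.get_weights()])
#   model.fit(...)
#   after = fingerprint([w.ravel().tolist() for w in model.get_weights()])

# Stand-in data: run_c differs from run_a only in the last bit of one value.
run_a = [[0.1, 0.2], [0.3]]
run_b = [[0.1, 0.2], [0.3]]
run_c = [[0.1, 0.2], [0.30000000000000004]]
print(fingerprint(run_a) == fingerprint(run_b))  # True
print(fingerprint(run_a) == fingerprint(run_c))  # False
```

If the fingerprints already differ before training, the variability comes from weight initialization. If they match before training and differ afterwards, it comes from the training step itself.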