I would like to get reproducible results for my TensorFlow runs. The way I'm trying to make this happen is by setting the NumPy and TensorFlow seeds:
import numpy as np
rnd_seed = 1
np.random.seed(rnd_seed)
import tensorflow as tf
tf.set_random_seed(rnd_seed)
I also make sure that the weights of the neural network, which I initialize with tf.truncated_normal, use that seed: tf.truncated_normal(..., seed=rnd_seed).
For reasons beyond the scope of this question, I'm using the sampled softmax loss function, tf.nn.sampled_softmax_loss, and unfortunately I'm not able to control its stochasticity with a random seed.
Looking at the TensorFlow documentation of this function (https://www.tensorflow.org/api_docs/python/tf/nn/sampled_softmax_loss), I can see that the sampled_values parameter should be the only one that affects randomization, but I'm not able to understand how to actually pass a seed through it (my best guess is sketched after the script below).
[EDITED] This is (part of) my script:
import math

import numpy as np
# set a seed so that the results are consistent
rnd_seed = 1
np.random.seed(rnd_seed)

import tensorflow as tf
tf.set_random_seed(rnd_seed)

embeddings_ini = np.random.uniform(low=-1, high=1,
                                   size=(self.vocabulary_size, self.embedding_size))

with graph.as_default(), tf.device('/cpu:0'):
    # Input data.
    train_dataset = tf.placeholder(tf.int32, shape=[None, None])
    train_labels = tf.placeholder(tf.int32, shape=[None, 1])
    valid_dataset = tf.constant(self.valid_examples, dtype=tf.int32)

    # Variables.
    initial_embeddings = tf.placeholder(tf.float32,
                                        shape=(self.vocabulary_size, self.embedding_size))
    embeddings = tf.Variable(initial_embeddings)
    softmax_weights = tf.Variable(
        tf.truncated_normal([self.vocabulary_size, self.embedding_size],
                            stddev=1.0 / math.sqrt(self.embedding_size),
                            seed=rnd_seed))
    softmax_biases = tf.Variable(tf.zeros([self.vocabulary_size]))

    # Model.
    # Look up embeddings for inputs.
    if self.model == "skipgrams":
        # Skip-gram model
        embed = tf.nn.embedding_lookup(embeddings, train_dataset)
    elif self.model == "cbow":
        # CBOW model
        embeds = tf.nn.embedding_lookup(embeddings, train_dataset)
        embed = tf.reduce_mean(embeds, 1, keep_dims=False)

    # Compute the softmax loss, using a sample of the negative labels each time.
    loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(weights=softmax_weights,
                                                     biases=softmax_biases,
                                                     inputs=embed,
                                                     labels=train_labels,
                                                     num_sampled=self.num_sampled,
                                                     num_classes=self.vocabulary_size))
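From the docs, sampled_values appears to expect the (sampled_candidates, true_expected_count, sampled_expected_count) tuple returned by one of the tf.nn.*_candidate_sampler functions, and those samplers do take a seed argument. So my best guess is that the loss above would have to be rewritten roughly like this (log_uniform_candidate_sampler is just my assumption of a suitable sampler; I haven't verified that this actually makes the runs reproducible):

# Guess: draw the negative samples with an explicitly seeded candidate sampler
# and hand them to sampled_softmax_loss through sampled_values.
sampled_values = tf.nn.log_uniform_candidate_sampler(
    true_classes=tf.cast(train_labels, tf.int64),  # sampler expects int64 labels
    num_true=1,
    num_sampled=self.num_sampled,
    unique=True,
    range_max=self.vocabulary_size,
    seed=rnd_seed)

loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(weights=softmax_weights,
                                                 biases=softmax_biases,
                                                 inputs=embed,
                                                 labels=train_labels,
                                                 num_sampled=self.num_sampled,
                                                 num_classes=self.vocabulary_size,
                                                 sampled_values=sampled_values))

Is this the intended way to use sampled_values, or is there some other way to seed the sampling inside tf.nn.sampled_softmax_loss?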