
In my project, training a CNN model for classification, I am getting non-deterministic results across runs.

I have set the seed for both NumPy and TensorFlow at the start of the session.
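
For reference, the seeding looks roughly like this (a minimal sketch, assuming TensorFlow 1.x; the seed value is illustrative):

import numpy as np
import tensorflow as tf

np.random.seed(0)       # fixes the NumPy RNG (data shuffling, NumPy-side values)
tf.set_random_seed(0)   # fixes the graph-level TensorFlow seed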

I have also checked that all the initialisations (i.e., the data shuffle and the weight initialisation for every layer) are identical across runs. Even so, the results (i.e., cost and accuracy) vary from run to run.

The weights of all layers also differ after a single epoch.

This is the cost function I'm using:

cost = tf.reduce_mean(tf.losses.hinge_loss(logits=pred, labels=labels), name=name)

These are the other operations I use for calculating accuracy, where Z is the output of the model's last layer:

pred = tf.argmax(Z, axis=1, name="predictions")    # predicted class per example
true_preds = tf.equal(pred, tf.argmax(Y, axis=1))  # compare with one-hot labels Y
accuracy = tf.reduce_mean(tf.cast(true_preds, tf.float32), name="Accuracy")
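
A minimal sketch of how these are evaluated (assuming placeholders X and Y and one fixed batch x_batch, y_batch; the names are illustrative):

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # run the same fixed batch twice; within one session the two
    # evaluations should normally agree, the variation I see is
    # across separate runs of the script
    c1, a1 = sess.run([cost, accuracy], feed_dict={X: x_batch, Y: y_batch})
    c2, a2 = sess.run([cost, accuracy], feed_dict={X: x_batch, Y: y_batch})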

Even when I run the model for only one epoch, the results are not repeatable, although, to reiterate, the initialisations are all consistent. I also run my code on a single GPU, with no parallelism.
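
One sanity check I could still try (a sketch, not something my code currently does) is pinning the graph to the CPU; if the results become repeatable there, the remaining variation comes from non-deterministic GPU kernels:

import tensorflow as tf

# Option 1: build the whole model on the CPU
with tf.device('/cpu:0'):
    pass  # ... build model, cost and accuracy here ...

# Option 2: hide the GPU from TensorFlow entirely
config = tf.ConfigProto(device_count={'GPU': 0})
sess = tf.Session(config=config)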

I have gone through similar questions but couldn't find a solution; please point me to any such solved question I may have missed.

BSR
  • This question might be relevant here: [How to handle non-determinism when training on a GPU?](https://stackoverflow.com/questions/50744565/how-to-handle-non-determinism-when-training-on-a-gpu/50746919#50746919). Spoiler alert: 'Unless you are debugging an issue, it is OK to have fluctuations between runs. Uncertainty is in the nature of training, and it is wise to measure it and take it into account when comparing results -- even when toolkits eventually reach perfect determinism in training' – meissner_ Jun 08 '18 at 08:43
  • I think you omitted the important part of the code, where the seed is applied. After 2 epochs the results should differ only negligibly, but if the difference in accuracy is more than, say, 4, then something is wrong. Usually the results aren't always the same because of data randomisation – Eliethesaiyan Jun 08 '18 at 08:46
  • I have added the seed-setting part at the beginning of the code, which I believe maintains the same seed globally. Also, as mentioned, the seed plays its role only during initialisation and when random functions are called, which happens before training even starts; at that stage all values are identical. @Eliethesaiyan – BSR Jun 08 '18 at 09:10

0 Answers