
I have a small network that I trained [for many hours] and saved to a checkpoint. Now I want to restore from the checkpoint in a different script and use it. I recreate the session: I build the entire network, so that all ops are created again, using the exact same code as before training. This code sets the random seed for TF using time.time() [which is different on every run].

I then restore from the checkpoint and run the network, and I get different numbers [small but meaningful differences] every time I run the restored network. Crucially, the input is fixed. If I fix the random seed to some constant value, the non-deterministic behavior goes away.

I am puzzled because I thought that a restore [no Variables were passed to the saver, so I presume the whole graph was checkpointed] eliminates all random behavior from this flow: initializations etc. are overridden by the restored checkpoint, and this is only a forward run.
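
Schematically, the flow looks like this (a minimal sketch; the architecture, `fixed_input`, and the checkpoint path are placeholders, not my actual code):

```python
import time
import numpy as np
import tensorflow as tf  # TF 1.x API

# Same construction code as during training; the seed comes from
# time.time(), so it is different on every run.
tf.set_random_seed(int(time.time()))

x = tf.placeholder(tf.float32, shape=[None, 784], name="x")   # placeholder architecture
W = tf.Variable(tf.truncated_normal([784, 10], stddev=0.1), name="W")
b = tf.Variable(tf.zeros([10]), name="b")
logits = tf.matmul(x, W) + b

saver = tf.train.Saver()  # no var_list given, so all Variables are restored

fixed_input = np.ones((1, 784), dtype=np.float32)  # the input never changes

with tf.Session() as sess:
    saver.restore(sess, "model.ckpt")  # placeholder checkpoint path
    print(sess.run(logits, feed_dict={x: fixed_input}))  # forward run only
```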

Is this possible? Does it make sense? Is there a way to find out which variables or factors in my graph are not set by the restored checkpoint?

amitmi
  • This suggests you have an op which uses random numbers (shuffle_batch, dropout, random_contrast). See if any of your ops accept a `seed` argument at construction time (see the short sketch after these comments). Restoring shouldn't run any Variable initialization automatically, and if you don't run `initialize_all_variables`, the next attempt to use an uninitialized variable will trigger an Exception – Yaroslav Bulatov May 02 '16 at 14:12
  • Did you use dropouts? – Sung Kim May 02 '16 at 14:56
  • By "forward run" you mean you're only using the restored network for prediction, right? In that case, should it matter whether dropout was used during training? I would think not, but I'm no expert. – Aenimated1 May 02 '16 at 15:15
  • Thanks!! It was the dropout, of course. Turns out you have to explicitly set the keep prob to 1.0 during prediction; I thought it was only applied during training. Thanks for the tip re selective initialization. – amitmi May 02 '16 at 15:22
  • @YaroslavBulatov, I'm facing a similar error with `DropoutWrapper`, where I set `keep_prob=0.8`. I've moved it to a new question: http://stackoverflow.com/questions/42156296/ – martianwars Feb 10 '17 at 09:56
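
To illustrate the first comment: several TF ops draw random numbers and accept an optional op-level `seed` at construction time (a minimal sketch; the shapes and values are arbitrary):

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 128])

# These ops draw random numbers; each accepts an optional op-level seed.
dropped = tf.nn.dropout(x, keep_prob=0.5, seed=1)
noise = tf.random_normal([128], seed=1)
# tf.train.shuffle_batch(...) and tf.image.random_contrast(...) also take seed=.
```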

1 Answer


It seems this question was already answered in the comments, but no one has written the answer down explicitly yet, so here it is:

You were expecting the computation graph to always return the same values, even with different random seeds, because you thought there was no op in your graph that depends on the random seed.

You forgot about the dropout. Unless you explicitly set the keep probability to 1.0 at prediction time, the dropout op draws new random numbers on every run, so the outputs differ even for a fixed input.
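
A common way to handle this (a minimal sketch, not your actual code) is to feed the keep probability through a placeholder that defaults to 1.0, so dropout is only active when you explicitly feed a smaller value during training:

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 784])
# Defaults to 1.0 (dropout disabled); feed e.g. 0.5 only while training.
keep_prob = tf.placeholder_with_default(1.0, shape=())

W = tf.Variable(tf.truncated_normal([784, 256], stddev=0.1))
b = tf.Variable(tf.zeros([256]))
hidden = tf.nn.relu(tf.matmul(x, W) + b)
hidden = tf.nn.dropout(hidden, keep_prob)  # the random op behind the differing outputs

# Training:   sess.run(train_op, feed_dict={x: batch, keep_prob: 0.5})
# Prediction: sess.run(hidden,   feed_dict={x: batch})   # keep_prob stays 1.0 -> deterministic
```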

In any case, I would keep the random seed fixed anyway. Then this and any other random ops become deterministic, and your whole training can be as well. If you are wondering at some point how much variance you get from different random seeds, you can explicitly try other seeds.
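
A minimal sketch of what fixing the seed could look like (the value 42 is arbitrary):

```python
import tensorflow as tf

# Fix the graph-level seed once, before building the graph; combined with
# op-level seeds where supported, this makes random ops reproducible.
tf.set_random_seed(42)
```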

Albert