6

I am using AWS EC2 to train a model for a multi-label classification task. After training, I tested the model on the same machine and got good results (accuracy above 90%). However, after I imported the saved model onto my local machine (no GPU), it gives very different results (accuracy below 5%). Any suggestions on why this is happening? Thanks.

TL;DR: A Keras/TensorFlow model produces different results when transferred from a GPU machine to a CPU machine.

  • So what data are you training your model with, and how did you program it to do this? – user7568042 Feb 24 '17 at 22:56
  • I'm working with text data (tweets). I'm classifying them based on the emotions they express (joy, sadness, anger, etc.). I am using Keras over TensorFlow. My network is a sequential model (embedding layer -> bidirectional LSTM -> sigmoid dense layer). I'm using binary_crossentropy (since I want multiple 0/1 emotion outputs) and rmsprop as the optimiser; a minimal sketch of this setup appears after these comments. It's working great on AWS but not on my local machine. – Vainglory Arcanine Feb 24 '17 at 23:24
  • I already changed the versions of Keras and TensorFlow on my local machine to match those on AWS. The only difference now is that I'm using the CPU version on my laptop while the AWS EC2 instance uses a GPU. However, my local machine still gives very different output compared to AWS. – Vainglory Arcanine Feb 24 '17 at 23:26
  • 1
    You should provide code and what results you are getting, else the question is completely unanswerable. – Dr. Snoopy Feb 25 '17 at 09:56
  • Sir @MatiasValdenegro, my code is based on this [github code](https://github.com/alexander-rakhlin/CNN-for-Sentence-Classification-in-Keras/blob/master/trainGraph.py) posted by Sir Alexander Rakhlin. I modified it to use an RNN instead, for multiple emotion outputs. I was getting a good result of 90+% accuracy on the AWS EC2 machine but only 5% on my laptop, which doesn't have a GPU. I did some searching about this behavior and found that the cause is the cuDNN used on the GPU machine producing non-deterministic values, which randomizes the model. I posted the solution I found, with some links. – Vainglory Arcanine Feb 26 '17 at 01:30
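
For reference, here is a minimal sketch of the setup described in the comments above; the vocabulary size, sequence length, layer sizes and number of emotion labels are placeholders, not the asker's actual values:

from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, LSTM, Dense

model = Sequential()
model.add(Embedding(input_dim=20000, output_dim=128, input_length=50))  # tweet tokens -> dense vectors
model.add(Bidirectional(LSTM(64)))                                      # bidirectional LSTM over the sequence
model.add(Dense(6, activation='sigmoid'))                               # independent 0/1 score per emotion label
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])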

1 Answer

6

Upon searching the net, I found the problem. It seems that Keras over TensorFlow, when running on a GPU, tends to produce results that are not reproducible when the model is transferred to a non-GPU machine. This most likely has something to do with the cuDNN installation: cuDNN's max-pooling and some of its convolution backward algorithms are non-deterministic, as mentioned on a forum.

The solutions I found suggest calling numpy.random.seed(seed_no) before importing any Keras modules. This works when you run the code on a CPU, with both the Keras/Theano and Keras/TensorFlow back-ends.
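
For example, a minimal sketch (1234 is just a placeholder seed value; any fixed integer works):

import numpy as np
np.random.seed(1234)  # fix numpy's random state before any Keras import

from keras.models import Sequential  # Keras is imported only after the seed is set,
                                     # so the initial weights are drawn reproducibly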

The solution for GPU users running Keras over Theano involves adding the following to the .theanorc file:

[dnn.conv]
algo_bwd_filter = deterministic
algo_bwd_data = deterministic

Or using Theano flags: THEANO_FLAGS="dnn.conv.algo_bwd_filter=deterministic,dnn.conv.algo_bwd_data=deterministic" python rnn_model.py

However, I haven't found any clear instructions yet on how to produce reproducible results with Keras on a TensorFlow back-end running on a GPU.

  • Hi, thanks for your answer! What is the parameter seed_no for numpy.random.seed? And do you know why calling it will stop cuDNN from being non-deterministic? Thanks! – Pusheen_the_dev Feb 28 '17 at 14:58
  • Sir @Pusheen_the_dev, seed_no is an integer you choose to control numpy's random state, making the results predictable (see the short illustration below). You can find more info [here](http://stackoverflow.com/questions/21494489/what-does-numpy-random-seed0-do). As far as I know, Keras uses numpy to randomize the initial weights of the model, so setting a seed to control this random state will make the model produce uniform results across experiments. However, seeding will not stop cuDNN from being non-deterministic, since that is its default setting when installed; you can use Theano flags for this. – Vainglory Arcanine Mar 03 '17 at 13:28
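
A quick illustration of the effect of seeding (nothing model-specific; it just shows that resetting the seed reproduces the same draws, which is why the initial weights come out identical):

import numpy as np

np.random.seed(0)          # fix the random state
first = np.random.rand(3)

np.random.seed(0)          # reset to the same state
second = np.random.rand(3)

print(np.array_equal(first, second))  # True: identical draws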