6

I am using AWS EC2 to train a model for a multi-label classification task. After training, I tested the model on the same machine and got good results (accuracy above 90%). However, after I imported the saved model onto my local machine (no GPU), it gives very different results (accuracy below 5%). Any suggestions on why this is happening? Thanks.

TL;DR: A Keras/TensorFlow model produces different results when transferred from a GPU machine to a CPU machine.

  • So what data are you training your model with, and how did you program it to do this? – user7568042 Feb 24 '17 at 22:56
  • I'm working with text data (tweets). I'm classifying them based on the emotions they express (joy, sadness, anger, etc.). I am using Keras over TensorFlow. My network is a sequential model (embedding layer -> bidirectional LSTM -> sigmoid dense layer). I'm using binary_crossentropy (since I want multiple 0/1 emotion outputs) and rmsprop as the optimiser; a minimal sketch of this setup appears after these comments. It's working great on AWS but not on my local machine. – Vainglory Arcanine Feb 24 '17 at 23:24
  • I already changed the versions of Keras and TensorFlow on my local machine to match those on AWS. The only difference now is that I'm using the CPU version on my laptop while the AWS EC2 instance uses a GPU. However, my local machine still gives very different output compared to AWS. – Vainglory Arcanine Feb 24 '17 at 23:26
  • 1
    You should provide code and what results you are getting, else the question is completely unanswerable. – Dr. Snoopy Feb 25 '17 at 09:56
  • Sir @MatiasValdenegro, my code is based on this [github code](https://github.com/alexander-rakhlin/CNN-for-Sentence-Classification-in-Keras/blob/master/trainGraph.py) posted by Sir Alexander Rakhlin. I modified it to use an RNN instead, for multiple emotion outputs. I was getting a good result of 90+% accuracy on the AWS EC2 machine but only 5% on my laptop, which doesn't have a GPU. I did some searching about this behavior and found that the cause is the cuDNN used on the GPU machine producing non-deterministic values, which randomizes the model. I posted the solution I found, with some links. – Vainglory Arcanine Feb 26 '17 at 01:30
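
For reference, here is a minimal sketch of the setup described in the comments above; the vocabulary size, sequence length, layer sizes and number of emotion labels are placeholders, not the asker's actual values:

from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, LSTM, Dense

model = Sequential()
model.add(Embedding(input_dim=20000, output_dim=128, input_length=50))  # tweet tokens -> dense vectors
model.add(Bidirectional(LSTM(64)))                                      # bidirectional LSTM over the sequence
model.add(Dense(6, activation='sigmoid'))                               # independent 0/1 score per emotion label
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])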

1 Answer

6

Upon searching the net, I found the problem. It seems that Keras over TensorFlow, when running on a GPU, tends to produce results that are not reproducible when the model is transferred to a non-GPU machine. This most likely has something to do with the cuDNN installation: cuDNN's max-pooling and some of its convolution backward algorithms are non-deterministic, as mentioned on a forum.

The solutions I found suggest calling numpy.random.seed(seed_no) before importing any Keras modules. This works when you run the code on a CPU, with both the Keras/Theano and Keras/TensorFlow back-ends.
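
For example, a minimal sketch (1234 is just a placeholder seed value; any fixed integer works):

import numpy as np
np.random.seed(1234)  # fix numpy's random state before any Keras import

from keras.models import Sequential  # Keras is imported only after the seed is set,
                                     # so the initial weights are drawn reproducibly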

The solution for GPU users running Keras over Theano involves adding the following to the .theanorc file:

[dnn.conv]
algo_bwd_filter = deterministic
algo_bwd_data = deterministic

Or using Theano flags: THEANO_FLAGS="dnn.conv.algo_bwd_filter=deterministic,dnn.conv.algo_bwd_data=deterministic" python rnn_model.py

However, I haven't found any clear instructions yet on how to produce reproducible results with Keras on a TensorFlow back-end running on a GPU.

  • Hi, thanks for your answer! What is the parameter seed_no for numpy.random.seed? And do you know why calling it will stop cuDNN from being non-deterministic? Thanks! – Pusheen_the_dev Feb 28 '17 at 14:58
  • Sir @Pusheen_the_dev, seed_no is an integer you choose to control numpy's random state, making the results predictable (see the short illustration below). You can find more info [here](http://stackoverflow.com/questions/21494489/what-does-numpy-random-seed0-do). As far as I know, Keras uses numpy to randomize the initial weights of the model, so setting a seed to control this random state will make the model produce uniform results across experiments. However, seeding will not stop cuDNN from being non-deterministic, since that is its default setting when installed; you can use Theano flags for this. – Vainglory Arcanine Mar 03 '17 at 13:28
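
A quick illustration of the effect of seeding (nothing model-specific; it just shows that resetting the seed reproduces the same draws, which is why the initial weights come out identical):

import numpy as np

np.random.seed(0)          # fix the random state
first = np.random.rand(3)

np.random.seed(0)          # reset to the same state
second = np.random.rand(3)

print(np.array_equal(first, second))  # True: identical draws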