
I have a MobileNet built with Keras. Running it locally takes around 290 seconds per step, but when I run it on Google Cloud ML Engine it takes over 400 seconds. I added the following to my code:

from keras import backend as K

K.tensorflow_backend._get_available_gpus()

And the log is the following:

['/job:localhost/replica:0/task:0/device:GPU:0']
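
One way to verify that the ops actually run on the GPU (rather than only that a GPU is visible) is TensorFlow 1.x device placement logging. A minimal sketch, assuming Keras with the TensorFlow backend:

import tensorflow as tf
from keras import backend as K

# Log which device each op is placed on; the messages show up in the job logs.
config = tf.ConfigProto(log_device_placement=True)
K.set_session(tf.Session(config=config))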

I have also tried changing from 1 GPU to 4 GPUs, but the result is the same. Do I have to change any code to optimize Keras for the GPU?
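
On the multi-GPU point: in standalone Keras 2 with the TensorFlow backend, adding GPUs to the machine does not distribute training by itself; the model has to be wrapped with keras.utils.multi_gpu_model. A minimal sketch, with placeholder input shape and class count:

from keras.applications.mobilenet import MobileNet
from keras.utils import multi_gpu_model

# Build the single-GPU model as usual (placeholder input shape and classes).
base_model = MobileNet(input_shape=(224, 224, 3), weights=None, classes=10)

# Replicate it across 4 GPUs; each batch is split evenly over the replicas.
parallel_model = multi_gpu_model(base_model, gpus=4)
parallel_model.compile(optimizer='adam', loss='categorical_crossentropy')

This only improves throughput if the per-replica batch size stays large enough to keep each GPU busy.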

Dani Gonzalez
  • This can happen if your batch size is not large enough; I have explained the same in this answer: https://stackoverflow.com/questions/54295934/best-way-to-import-data-in-google-colaboratory-for-fast-computing-and-training/54453951#54453951 – anand_v.singh Feb 13 '19 at 16:14
  • Thanks for the response @anand_v.singh, but that does not seem to be the solution. I have increased the batch size and the execution time does not change. – Dani Gonzalez Feb 13 '19 at 16:44
  • IO could be the bottleneck. What's your data format and how do you read them? – Guoqing Xu Feb 13 '19 at 19:42
  • My data are images of 224x224 and are read with this function https://github.com/slequeux/xke-cloudml/blob/master/cloud_train/trainer/keras_gs.py – Dani Gonzalez Feb 15 '19 at 10:22
  • Which GPUs are you using locally and on ML Engine? Please reach out to cloudml-feedback@google.com if you continue to experience issues training your models. – rpasricha Mar 16 '19 at 00:50
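
The comments point at IO as a likely bottleneck: reading many small image files straight from Cloud Storage can dominate step time on ML Engine. One common mitigation is a tf.data input pipeline with parallel decoding and prefetching, so file reads overlap with GPU compute. A minimal sketch, assuming TensorFlow 1.x and JPEG input (the gs:// path, batch size, and image size are placeholders):

import tensorflow as tf

def parse_image(filename):
    # Read and decode one JPEG, then resize to the 224x224 MobileNet input.
    image = tf.read_file(filename)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.resize_images(image, [224, 224])
    return image / 255.0

# Placeholder bucket path; point this at the real training data.
files = tf.data.Dataset.list_files('gs://my-bucket/train/*.jpg')
dataset = (files
           .map(parse_image, num_parallel_calls=4)
           .batch(32)
           .prefetch(1))

With tf.keras, a dataset like this can be passed to model.fit together with steps_per_epoch; the key point is the parallel map and prefetch.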

0 Answers