5

I am using keras-rl to train my network with the D-DQN algorithm. I am running my training on the GPU with the `model.fit_generator()` function so that data can be sent to the GPU while it is doing backprops. I suspect that data generation is too slow compared to the speed at which the GPU processes it.

To generate data, as the D-DQN algorithm prescribes, I must first predict Q-values with my models and then use these values for the backpropagation. And if the GPU is used to run these predictions, they interrupt the flow of my training data (I want backprops to run as often as possible).
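Roughly, each generation step looks like this (a simplified sketch; `model`, `target_model` and `memory` are illustrative names, not my exact code):

import numpy as np

def generate_batch(model, target_model, memory, gamma=0.99, batch_size=32):
    # Sample a batch of transitions from the replay memory
    states, actions, rewards, next_states, dones = memory.sample(batch_size)
    # Double DQN: the online model selects the next action,
    # the target model evaluates it
    next_actions = np.argmax(model.predict_on_batch(next_states), axis=1)
    next_q = target_model.predict_on_batch(next_states)
    # Start from the current predictions and overwrite the taken actions
    targets = model.predict_on_batch(states)
    for i in range(batch_size):
        q_update = rewards[i]
        if not dones[i]:
            q_update += gamma * next_q[i, next_actions[i]]
        targets[i, actions[i]] = q_update
    return states, targets  # consumed by fit_generator for the backprop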

Is there a way to specify on which device specific operations run, so that I could run the predictions on the CPU and the backprops on the GPU?

Raphael Royer-Rivard

2 Answers

8

Maybe you can save the model at the end of the training. Then start another Python file and write `os.environ["CUDA_VISIBLE_DEVICES"] = "-1"` before you import any Keras or TensorFlow stuff. Now you should be able to load the model and make predictions with your CPU.
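For example (a minimal sketch; the model filename is illustrative):

import os
# Hide all GPUs from TensorFlow; this must run before
# keras/tensorflow are imported
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

import numpy as np
from keras.models import load_model

# Hypothetical path to the model saved at the end of training
model = load_model("my_model.h5")

# Predictions now run on the CPU, since no GPU is visible
batch = np.zeros((1,) + model.input_shape[1:])  # dummy input matching the model
print(model.predict(batch))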

mss
3

It's hard to properly answer your question without seeing your code.

The code below shows how you can list the available devices and force TensorFlow to use a specific device.

import tensorflow as tf
from tensorflow.python.client import device_lib

def get_available_devices():
    # List every device (CPU and GPU) that TensorFlow can see
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos]

get_available_devices()

with tf.device('/gpu:0'):
    pass  # build/run GPU ops here
with tf.device('/cpu:0'):
    pass  # build/run CPU ops here
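Note that with Keras the device scope generally has to be active when the model's ops are constructed, i.e. when you build and compile the model, not only when you call `predict` or `fit`. A minimal sketch, assuming a toy architecture:

import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense

# Ops are assigned to a device when the graph is built,
# so wrap model construction and compilation in the device scope
with tf.device('/cpu:0'):
    prediction_model = Sequential([Dense(24, activation='relu', input_shape=(4,)),
                                   Dense(2)])
    prediction_model.compile(optimizer='adam', loss='mse')

with tf.device('/gpu:0'):
    training_model = Sequential([Dense(24, activation='relu', input_shape=(4,)),
                                 Dense(2)])
    training_model.compile(optimizer='adam', loss='mse')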
VegardKT
  • I already tried to use `with tf.device('/cpu:0')` just before executing `model.predict_on_batch()` and `with tf.device('/gpu:0')` before `model.fit_generator()` but it did nothing. Does it need to be used also when creating the model? I also tried to use tf as `keras.backend.tensorflow_backend` but then `device()` is not a method of that tensorflow. – Raphael Royer-Rivard Jul 25 '18 at 17:36
  • What does "did nothing" mean? Do you mean that it still used the GPU regardless of your set device? – VegardKT Jul 26 '18 at 09:38
  • Yes, it did nothing different from when I was not using these commands, which is to use the GPU for both the predictions and the fit. – Raphael Royer-Rivard Jul 26 '18 at 13:22
  • As far as I can see from the TensorFlow documentation, this should be the way to do it. You may want to try and specify the device when you compile your model as well, just in case. Quote from the doc: "All operations constructed in this context will be placed on GPU 0." A bit hard to determine what they mean by "constructed", though. [Link: Device](https://www.tensorflow.org/api_docs/python/tf/Graph#device) – VegardKT Jul 26 '18 at 13:36
  • I just tried with the `Graph`'s `device` function instead of `tf.device` directly, and I also compiled the models on their respective devices, but I saw no difference. I cannot find the `Graph()` method in the `keras.backend.tensorflow_backend` module, so using tensorflow directly might be a problem here. The documentation says `N.B. The device scope may be overridden by op wrappers or other library code. For example, a variable assignment op v.assign() must be colocated with the tf.Variable v, and incompatible device scopes will be ignored.` I don't know how Keras is using `tf.Variable`s... – Raphael Royer-Rivard Jul 26 '18 at 15:56
  • Not sure what's going on then, may have to see your code to be able to help any further. As far as I know, this is the way to do it. – VegardKT Jul 27 '18 at 07:23
  • The project I'm working on is so big, I don't really know what part of it I should show... Basically I create a `DQNAgent`, I make sure to call the `with tf.Graph().device('/cpu:0')` before compiling its models that make the predictions with `predict_on_batch` and `with tf.Graph().device('/device:GPU:0')` before compiling its model that will be trained with `fit_generator`. I also use these commands just before doing the predictions and the fit. – Raphael Royer-Rivard Jul 27 '18 at 13:51