
I am using this code (please excuse its messiness) to run on my CPU. I have a custom RL environment that I created myself, and I am using a DQN agent.

But when I run this code on a GPU, it doesn't utilize much of the GPU and is in fact slower than on my CPU.

This is the output of nvidia-smi. As you can see, my process is running on the GPUs, but it is much slower than I would expect.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN Xp            Off  | 00000000:00:05.0 Off |                  N/A |
| 23%   37C    P2    60W / 250W |  11619MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  TITAN Xp            Off  | 00000000:00:06.0 Off |                  N/A |
| 23%   29C    P8     9W / 250W |    157MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     25540      C   python3                                    11609MiB |
|    1     25540      C   python3                                      147MiB |
+-----------------------------------------------------------------------------+

Can anyone point out what I can do to change my code to make better use of the GPU?

PS: Notice that I have two GPUs and my process is running on both of them. Even if I use only one of the two GPUs, it is still barely utilized and training is slower than on my CPU, so having two GPUs is not the issue.
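For reference, here is a minimal, illustrative sketch (assuming TensorFlow 2.x; the device names and pinning below are hypothetical, not my actual training code) of how device placement can be logged and TensorFlow restricted to a single GPU:

```python
import tensorflow as tf

# Print the device each op is placed on, to confirm whether the heavy
# work really runs on the GPU or silently falls back to the CPU.
tf.debugging.set_log_device_placement(True)

# Make only the first GPU visible and let memory grow on demand,
# instead of reserving almost all memory on both cards up front.
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    tf.config.set_visible_devices(gpus[0], 'GPU')
    tf.config.experimental.set_memory_growth(gpus[0], True)

print(tf.config.list_logical_devices('GPU'))
```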

  • You could use [tf.distribute.MirroredStrategy](https://keras.io/guides/distributed_training/#singlehost-multidevice-synchronous-training) to utilize both GPUs, but I noticed that your implementation does not use Keras at all, even though you added it to the tags. – yudhiesh Jan 20 '21 at 14:30
  • Have you seen this: https://stackoverflow.com/questions/45662253/can-i-run-keras-model-on-gpu? – nsidn98 Jan 20 '21 at 15:14
  • I think the recommended CUDA version has not been installed. Cross-check your CUDA setup. – ML85 Jan 20 '21 at 16:01
  • My mistake, I am using `tf.keras` @yudhiesh, and my issue is not that I can't use both GPUs; not even one is being utilized well. @nsidn98 I have seen this, yes, but it's not relevant: my model is running on the GPU, but the work is somehow not being parallelized, or maybe something else is going on. – M. Awais Jadoon Jan 20 '21 at 16:08
  • @ML85 doesn't it say `CUDA Version: 10.2`? Is there anything that I should check? thanks – M. Awais Jadoon Jan 20 '21 at 17:01
  • Test with 10.0 or so, because the TensorFlow version you are using is built against a specific CUDA version, and the two need to match. – ML85 Jan 21 '21 at 09:16
  • I would recommend profiling with the TensorFlow Profiler. To set it up and see how it works, see https://www.tensorflow.org/guide/profiler#profiling_apis (a minimal setup sketch follows after these comments). If I had to guess, you are using the GPUs, but only a little, because you are running mostly scipy/numpy code instead of TensorFlow functions on tensors. Also consider that RL in general tends to have lower GPU utilization, since the environment cannot run on the GPU and is slow Python code (so the agent ends up waiting a lot). I would strongly recommend using the TF-Agents framework. – Federico Malerba Jan 26 '21 at 08:56
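A minimal sketch of the profiler setup suggested in the last comment (assuming TensorFlow 2.2+; `logdir` and `train_step` are placeholder names, not taken from the question's code):

```python
import tensorflow as tf

logdir = "logs/dqn_profile"  # placeholder log directory

def train_step():
    # Placeholder for the actual DQN update (sample the replay buffer,
    # compute targets, apply one gradient step).
    pass

# Profile a short window of training, then inspect it in TensorBoard's
# "Profile" tab: tensorboard --logdir logs/dqn_profile
tf.profiler.experimental.start(logdir)
for step in range(100):
    with tf.profiler.experimental.Trace('train', step_num=step, _r=1):
        train_step()
tf.profiler.experimental.stop()
```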

0 Answers