nvprof is using all available GPUs when profiling a Python script

Question

I am using a remote machine with 2 GPUs to run a Python script that contains CUDA code. To find where I can improve the performance of my code, I am trying to use nvprof.

In my code I have specified that only one of the 2 GPUs on the remote machine should be used. However, when I call nvprof --profile-child-processes ./myscript.py, a process with the same ID is started on each of the GPUs.

Is there any argument I can give nvprof in order to only use one GPU for the profiling?

Use the environment variable `CUDA_VISIBLE_DEVICES="0"` to restrict which GPUs `nvprof` can access. For example, `CUDA_VISIBLE_DEVICES="0" nvprof --profile-child-processes ./myscript.py` would limit nvprof to the first GPU, and `CUDA_VISIBLE_DEVICES="1" nvprof --profile-child-processes ./myscript.py` would limit it to the second GPU, etc. The environment variable is documented [here](http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars). `nvprof --help` also shows a `--devices` switch, which can limit certain activities to certain GPUs. — Robert Crovella, Apr 06 '17 at 14:39

Robert Crovella · Accepted Answer · 2022-07-20T14:08:06.583

As you have pointed out, you can profile Python code with the CUDA profilers simply by having the profiler launch the Python interpreter, which in turn runs your script:

nvprof python ./myscript.py

Regarding the GPUs being used, the CUDA environment variable CUDA_VISIBLE_DEVICES can be used to restrict the CUDA runtime API to use only certain GPUs. You can try it like this:

CUDA_VISIBLE_DEVICES="0" nvprof --profile-child-processes python ./myscript.py
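The same variable can also be set from inside the Python script, as long as this happens before any CUDA-using library initializes CUDA. The following is a minimal sketch under that assumption; the helper name `restrict_visible_gpus` is hypothetical and not part of any CUDA or nvprof API. Note that the selected GPU is then enumerated as device 0 inside the process, and that this only affects the Python process and its children, not the profiler that launched it.

```python
import os


def restrict_visible_gpus(device_ids):
    """Set CUDA_VISIBLE_DEVICES for this process and its children.

    Hypothetical helper for illustration. It must run before CUDA is
    initialized, i.e. before importing or using any CUDA-backed library.
    """
    value = ",".join(str(d) for d in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Expose only the first physical GPU; inside this process it is device 0.
restrict_visible_gpus([0])

# Import and use the CUDA-backed library only after this point.
```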

nvprof is also documented, and it has command-line help via nvprof --help. Looking at the command-line help, I see a --devices switch which appears to limit at least some functions to particular GPUs. You could try it with:

nvprof --devices 0 --profile-child-processes python ./myscript.py
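If you launch the profiler from a Python wrapper, the command line above could be assembled as in the sketch below. The function names `build_nvprof_command` and `run_nvprof` are hypothetical; only the flags already shown above are used, and the sketch assumes `nvprof` and `python` are on `PATH`. It makes no claim about which activities `--devices` actually restricts.

```python
import shutil
import subprocess


def build_nvprof_command(script, device_ids, interpreter="python"):
    """Build the argument list for the nvprof invocation shown above.

    Hypothetical helper: it only joins the flags from the answer and
    does not verify how nvprof interprets them.
    """
    devices = ",".join(str(d) for d in device_ids)
    return [
        "nvprof",
        "--devices", devices,
        "--profile-child-processes",
        interpreter, script,
    ]


def run_nvprof(script, device_ids):
    """Run the profiler if it is installed and return its exit code."""
    if shutil.which("nvprof") is None:
        raise FileNotFoundError("nvprof was not found on PATH")
    return subprocess.run(build_nvprof_command(script, device_ids)).returncode


# Print the command without running it.
print(" ".join(build_nvprof_command("./myscript.py", [0])))
```

Calling `run_nvprof("./myscript.py", [0])` would then start the profiler, provided it is installed on the machine.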

For newer GPUs, nvprof may not be the best profiler choice. You should be able to use Nsight Systems in a similar fashion, for example via:

nsys profile --stats=true python ....

Additional "newer" profiler resources are linked here.
