Check if GPU is shared

Question

When the GPU is shared with other processes (e.g. Xorg or other CUDA processes), a CUDA process should preferably not consume all remaining memory but grow its usage dynamically instead.

(There are various errors you might get indirectly from this, like Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR. But this question is not about that.)

(In TensorFlow, you would use allow_growth=True in the GPU options to accomplish this. But this question is not about that.)

Is there a simple way to check if the GPU is currently used by other processes? (I'm not asking whether it is configured to be used for exclusive access.)

I could parse the output of nvidia-smi and look for other processes. But that seems somewhat hacky, possibly unreliable, and not simple enough.

(My software uses TensorFlow, so if TensorFlow provides such a function, nice. If not, I don't care whether this is a C API or a Python function. I would prefer to avoid other external dependencies though, except those I'm already using, like CUDA itself or TensorFlow. I'm not afraid to use ctypes. So consider this question language-agnostic.)

Albert · Accepted Answer · 2021-03-08T10:49:24.237


NVML provides nvmlDeviceGetComputeRunningProcesses and nvmlDeviceGetGraphicsRunningProcesses (see the NVML documentation). This is a C API, but I could use pynvml if I don't mind the extra dependency.
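A minimal sketch of how these two calls might be combined through pynvml, assuming the package is installed and the process list returned by NVML shows PIDs in the same namespace as os.getpid() (which may not hold inside some containers). The helper names other_gpu_pids and gpu_is_shared are hypothetical, not part of pynvml.

```python
import os


def other_gpu_pids(compute_procs, graphics_procs, own_pid=None):
    """Return the sorted PIDs found on the GPU, excluding our own process."""
    own_pid = os.getpid() if own_pid is None else own_pid
    pids = {p.pid for p in list(compute_procs) + list(graphics_procs)}
    pids.discard(own_pid)
    return sorted(pids)


def gpu_is_shared(device_index=0):
    """Ask NVML which processes use the device; True if any is not ours."""
    import pynvml  # extra dependency, imported lazily (assumed installed)

    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
        compute = pynvml.nvmlDeviceGetComputeRunningProcesses(handle)
        graphics = pynvml.nvmlDeviceGetGraphicsRunningProcesses(handle)
    finally:
        pynvml.nvmlShutdown()
    return bool(other_gpu_pids(compute, graphics))
```

Calling gpu_is_shared() before TensorFlow initializes the device would then indicate whether something like Xorg or another CUDA process is already present. The filtering is kept in a separate function so it can be checked without a GPU.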

edited Mar 08 '21 at 10:49

answered Mar 08 '21 at 10:30



You can also check (at least via the CUDA runtime API) the total amount of GPU memory and how much of it is currently being used by all processes – Ander Biguri Mar 08 '21 at 11:51
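A rough sketch of that memory check using ctypes and cudaMemGetInfo from the CUDA runtime. The library name "libcudart.so" is an assumption (the file name differs by platform and CUDA version), and the helper names are hypothetical. Note that the call itself creates a CUDA context on the current device, which uses some memory, and that the numbers cover all processes including your own, so this is only a heuristic for sharing.

```python
import ctypes


def used_fraction(free_bytes, total_bytes):
    """Fraction of device memory in use by all processes combined."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return (total_bytes - free_bytes) / total_bytes


def cuda_mem_info(lib_name="libcudart.so"):
    """Return (free, total) in bytes for the current device.

    lib_name is an assumed default; adjust it to the installed runtime.
    """
    cudart = ctypes.CDLL(lib_name)
    cudart.cudaMemGetInfo.restype = ctypes.c_int
    cudart.cudaMemGetInfo.argtypes = [ctypes.POINTER(ctypes.c_size_t)] * 2
    free, total = ctypes.c_size_t(), ctypes.c_size_t()
    status = cudart.cudaMemGetInfo(ctypes.byref(free), ctypes.byref(total))
    if status != 0:  # 0 is cudaSuccess
        raise RuntimeError(f"cudaMemGetInfo failed with error code {status}")
    return free.value, total.value
```

If used_fraction(*cuda_mem_info()) is noticeably above zero before your own process has allocated anything, something else is probably using the device.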
