I have a server with Ubuntu 16.04 installed. It has a K80 GPU. Multiple processes are using the GPU.
Some processes have unpredictable GPU usage, and I want to reliably monitor their GPU usage.
I know that you can query GPU usage via nvidia-smi, but that only gives you the usage at the moment of the query.
Currently I query this information every 100 ms, but that only samples the GPU usage and can miss a short-lived peak that falls between two samples.
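To make the sampling approach concrete, here is a minimal sketch of that kind of polling loop, not a definitive implementation. It assumes nvidia-smi is on the PATH and that per-process memory is reported for the GPU in question. The helper names (parse_compute_apps, update_peaks, monitor) are hypothetical, and summing a PID's memory across GPUs is my own choice for a dual-GPU board like the K80.

```python
import subprocess
import time

# nvidia-smi query that lists compute processes as "pid, used_memory" (MiB)
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(output):
    """Parse 'pid, used_memory' CSV lines into a {pid: MiB} dict."""
    usage = {}
    for line in output.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 2 or not parts[0].isdigit() or not parts[1].isdigit():
            continue  # skip blank lines and non-numeric values such as "[N/A]"
        pid, mib = int(parts[0]), int(parts[1])
        # Assumption: a PID listed on several GPUs is summed into one figure
        usage[pid] = usage.get(pid, 0) + mib
    return usage


def update_peaks(peaks, sample):
    """Raise each PID's recorded peak if the new sample is higher."""
    for pid, mib in sample.items():
        if mib > peaks.get(pid, 0):
            peaks[pid] = mib
    return peaks


def monitor(interval_s=0.1, duration_s=10.0):
    """Poll nvidia-smi and return the highest sampled usage per PID."""
    peaks = {}
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        out = subprocess.check_output(QUERY, universal_newlines=True)
        update_peaks(peaks, parse_compute_apps(out))
        time.sleep(interval_s)
    return peaks
```

Calling monitor() returns a dict mapping each PID to the highest value seen. This is still only the maximum over the samples, so an allocation that is made and freed between two polls never shows up.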
Is there a reliable way to get the maximum GPU memory usage of a given process, identified by its PID?