I came here from this post: How do I use Nvidia Multi-process Service (MPS) to run multiple non-MPI CUDA applications?

When I run ./mps_run before launching MPS, I get:

kernel duration: 4.999370s
kernel duration: 5.012310s

And when I check nvidia-smi during those 5 seconds:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.102.04   Driver Version: 450.102.04   CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000001:00:00.0 Off |                    0 |
| N/A   28C    P0    38W / 250W |    508MiB / 16280MiB |    100%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

It looks like the GPU I am using already supports multiprocessing somehow.

When I run nvidia-smi -i 2 -c EXCLUSIVE_PROCESS, the output is: No devices were found

This is weird.

How do I know whether my GPU supports multiprocessing?

The GPU I am using: Tesla P100 (GP100GL)

Andy Huang

1 Answer

In the post you linked, in the UPDATE section of my answer, I indicated that the GPU scheduler changed in Pascal and later GPUs (your Tesla P100 is a Pascal GPU).

MPS is supported on all current NVIDIA GPUs.

The results you got are expected (in the non-MPS case) because the GPU scheduler allows both kernels to run, in a time-sliced fashion. All currently supported CUDA GPUs support multiprocessing (in Default compute mode). However, older GPUs (e.g. Kepler) would run the kernel from one process to completion, then the kernel from the other process. Pascal and newer GPUs run the kernel from one process for a period of time, then the kernel from the other process for a period of time, then the first process again, and so on, in a round-robin time-sliced fashion.
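To make the difference between the two scheduling behaviours concrete, here is a toy model in Python. It is only a sketch, not the real GPU scheduler. It assumes the test kernel is a busy-wait that exits once a fixed amount of clock time has passed since it first started running (an assumption about mps_run, not something shown in this thread), and the 10 ms time slice is a made-up value chosen for illustration. The function names are hypothetical.

```python
def run_to_completion(delays_ms):
    """Older-GPU behaviour as described above: one process's kernel runs
    until it is done, then the next process's kernel runs."""
    finish, t = [], 0
    for delay in delays_ms:
        t += delay              # the kernel busy-waits for `delay` ms
        finish.append(t)        # time since launch at which it finishes
    return finish


def time_sliced(delays_ms, slice_ms):
    """Round-robin time slicing: each process holds the GPU for at most
    slice_ms (hypothetical value) before the next process gets a turn."""
    n = len(delays_ms)
    deadline = [None] * n       # set when a kernel first gets the GPU
    finish = [None] * n
    t, i = 0, 0
    while any(f is None for f in finish):
        if finish[i] is None:
            if deadline[i] is None:
                deadline[i] = t + delays_ms[i]
            if deadline[i] <= t + slice_ms:
                # The busy-wait deadline falls inside this slice (or has
                # already passed while the process was switched out).
                t = max(t, deadline[i])
                finish[i] = t
            else:
                t += slice_ms   # slice used up, kernel still waiting
        i = (i + 1) % n
    return finish


# Two processes, each launching a 5-second delay kernel at time 0.
print(run_to_completion([5000, 5000]))        # [5000, 10000]
print(time_sliced([5000, 5000], slice_ms=10)) # [5000, 5010]
```

Under these assumptions, the time-sliced case has both kernels finishing about 5 seconds after launch, which is the pattern in the question's output, while the run-to-completion case makes the second process wait for the first.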

Robert Crovella