
The output of nvidia-smi shows the PIDs of processes running on the GPU:

Thu May 10 09:05:07 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.111                Driver Version: 384.111                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:0A:00.0 Off |                  N/A |
| 61%   74C    P2   195W / 250W |   5409MiB / 11172MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      5973      C   ...master_JPG/build/tools/program_pytho.bin  4862MiB |
|    0     46324      C   python                                       537MiB |
+-----------------------------------------------------------------------------+

How do I show the usernames associated with each process?

This shows the username of an individual PID:

ps -u -p $pid
miken32
Dang Manh Truong

7 Answers


I created a script that takes nvidia-smi output and enriches it with some more information: https://github.com/peci1/nvidia-htop .

It is a Python script that parses the GPU process list, extracts the PIDs, runs them through ps to gather more information, and then replaces nvidia-smi's process list with the enriched listing.

Example of use:

$ nvidia-smi | nvidia-htop.py -l
Mon May 21 15:06:35 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.25                 Driver Version: 390.25                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:04:00.0 Off |                  N/A |
| 53%   75C    P2   174W / 250W |  10807MiB / 11178MiB |     97%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:05:00.0 Off |                  N/A |
| 66%   82C    P2   220W / 250W |  10783MiB / 11178MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108...  Off  | 00000000:08:00.0 Off |                  N/A |
| 45%   67C    P2    85W / 250W |  10793MiB / 11178MiB |     51%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
|  GPU   PID     USER    GPU MEM  %MEM  %CPU  COMMAND                                                                                               |
|    0  1032 anonymou   10781MiB   308   3.7  python train_image_classifier.py --train_dir=/mnt/xxxxxxxx/xxxxxxxx/xxxxxxxx/xxxxxxx/xxxxxxxxxxxxxxx  |
|    1 11021 cannotte   10765MiB   114   1.5  python3 ./train.py --flagfile /xxxxxxxx/xxxxxxxx/xxxxxxxx/xxxxxxxxx/xx/xxxxxxxxxxxxxxx                |
|    2 25544 nevermin   10775MiB   108   2.0  python -m xxxxxxxxxxxxxxxxxxxxxxxxxxxxx                                                               |
+-----------------------------------------------------------------------------+
Martin Pecka

I did it with `nvidia-smi -q -x`, which produces XML-style output:

ps -up `nvidia-smi -q -x | grep pid | sed -e 's/<pid>//g' -e 's/<\/pid>//g' -e 's/^[[:space:]]*//'`
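Instead of grepping the XML with sed, the `-q -x` output can also be fed to a real XML parser. A minimal sketch (the `pids_from_xml` helper is mine, not part of nvidia-smi; it only assumes process IDs live in `<pid>` elements, as the grep above shows):

```python
import subprocess
import xml.etree.ElementTree as ET

def pids_from_xml(xml_text):
    """Collect all <pid> elements from `nvidia-smi -q -x` output."""
    root = ET.fromstring(xml_text)
    return [int(e.text) for e in root.iter("pid")]

# Usage (requires an NVIDIA GPU):
#   xml_text = subprocess.run(["nvidia-smi", "-q", "-x"],
#                             capture_output=True, text=True).stdout
#   print(pids_from_xml(xml_text))
```

Parsing the XML is more robust than the sed pipeline if the indentation or tag layout ever changes.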
Junwon Lee
    This is great! When I try to alias this, bash tries to execute the PID as commands. Do you have a tip for aliasing it? – jay Jun 19 '19 at 01:57
  • @JayStanley, I'm by no means a Unix expert so this may not be the best solution, but FWIW, I found that I could export it into a variable like `export FOO="ps -up `nvidia-smi -q -x | grep pid | sed -e 's/<pid>//g' -e 's/<\/pid>//g' -e 's/^[[:space:]]*//'`"` and then simply type $FOO at the command line... If this is a bad solution I'd love to hear more from people who know more about Unix. – RMurphy Oct 21 '20 at 14:48
  • Use a function to alias it `nvidia_smi_users() { ps -up \`nvidia-smi -q -x | grep pid | sed -e 's/<pid>//g' -e 's/<\/pid>//g' -e 's/^[[:space:]]*//'\` ; }` – helperFunction Oct 06 '22 at 08:53
  • I like doing this: `nvidia-smi; ps -up `nvidia-smi -q -x | grep pid | sed -e 's/<pid>//g' -e 's/<\/pid>//g' -e 's/^[[:space:]]*//'`` – Charlie Parker Feb 03 '23 at 23:31
  • this solves my issues: https://stackoverflow.com/a/75403918/1601580 – Charlie Parker Feb 09 '23 at 20:09

This is the best I could come up with:

nvidia-smi
ps -up `nvidia-smi | tail -n +16 | head -n -1 | sed 's/\s\s*/ /g' | cut -d' ' -f3`

Sample output:

Thu May 10 15:23:08 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.111                Driver Version: 384.111                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:0A:00.0 Off |                  N/A |
| 41%   59C    P2   251W / 250W |   5409MiB / 11172MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1606      C   ...master_JPG/build/tools/program.bin       4862MiB |
|    0     15314      C   python                                       537MiB |
+-----------------------------------------------------------------------------+
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
user111+  1606  134  4.8 32980224 789164 pts/19 Rl+ 15:23   0:08 /home/user111
user2     15314  0.4 10.0 17936788 1647040 pts/16 Sl+ 10:41   1:20 python server_

Short explanation of the script:

  • tail and head remove the redundant header and footer lines

  • sed collapses runs of whitespace (after this, each column is separated by exactly one space)

  • cut extracts the PID column

The output is a list of PIDs, one per line. We then only need ps -up to show the relevant information.

UPDATE: A better solution:

ps -up `nvidia-smi | tee /dev/stderr | tail -n +16 | head -n -1 | sed 's/\s\s*/ /g' | cut -d' ' -f3`

This way, nvidia-smi would have to be called only once. See also:

How to output bash command to stdout and pipe to another command at the same time?

UPDATE 2: I've uploaded this to GitHub as a simple script for those who need detailed GPU information.

https://github.com/ManhTruongDang/check-gpu

Dang Manh Truong

The previous solution doesn't work for me, so I'm posting my solution here. The version of nvidia-smi I am using is 440.44, but I don't think the version matters.

nvidia-smi | tee /dev/stderr | awk '/ C / {print $3}' | xargs -r ps -up

A little explanation:

  • tee: avoids calling nvidia-smi twice
  • awk: grabs the PID column of compute processes (type C)
  • xargs -r: -r skips running the command when the input is empty, avoiding an undesirable error message from ps -up
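The awk step's behavior can be mirrored with a small Python equivalent (a sketch of `awk '/ C / {print $3}'` only; the `compute_pids` helper is mine and assumes the process-table layout shown in the question, where the PID is the third whitespace-separated field):

```python
def compute_pids(smi_output):
    """Mimic `awk '/ C / {print $3}'`: third field of lines containing ' C '."""
    return [line.split()[2]
            for line in smi_output.splitlines()
            if " C " in line]
```

Only process rows contain a standalone `C` surrounded by spaces, so the headers and borders are filtered out for free.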

If you want to make it an alias in .bash_profile or .bashrc:

alias nvidia-smi2='nvidia-smi | tee /dev/stderr | awk "/ C / {print \$3}" | xargs -r ps -up'

The difference is that $3 has to be escaped inside the double quotes.

Iron

Jay Stanley, I could alias Junwon Lee's command using xargs as follows:

alias gpu_user_usage="nvidia-smi -q -x | grep pid | sed -e 's/<pid>//g' -e 's/<\/pid>//g' -e 's/^[[:space:]]*//' | xargs ps -up"

(I could not comment due to reputation limitations...)

SveborK

As the comment by Robert suggests, this answer https://stackoverflow.com/a/51406093/2160809 recommends using gpustat, which I found really helpful.

First, install gpustat:

pip install gpustat

Then you can run (more details: https://github.com/wookayin/gpustat#usage):

gpustat -up
[0] NVIDIA GeForce GTX 1080 Ti | 90'C,  73 % |  6821 / 11178 MB | user1/732124(6817M)
[1] NVIDIA GeForce GTX 1080 Ti | 63'C,   0 % |  7966 / 11178 MB | user2/268172(1287M) user3/735496(6675M)
[2] NVIDIA GeForce GTX 1080 Ti | 66'C,  13 % |  2578 / 11178 MB | user2/268478(1287M) user2/725391(1287M)
[3] NVIDIA GeForce GTX 1080 Ti | 58'C,   0 % |  1291 / 11178 MB | user2/726058(1287M)
cookiemonster

Given my requirement to display the PID, username, GPU ID, and process/app name, here is the solution:

(echo "GPU_ID PID UID APP"
 for GPU in 0 1 2 3 ; do
   for PID in $( nvidia-smi -q --id=${GPU} --display=PIDS | awk '/Process ID/{print $NF}') ; do
     echo -n "${GPU} ${PID} "
     ps -up ${PID} | awk 'NR-1 {print $1,$NF}'
   done
 done) | column -t

credit: https://www.reddit.com/r/HPC/comments/10x9w6x/comment/j7sg7w2/?utm_source=share&utm_medium=web2x&context=3

Charlie Parker