
I am using GNU parallel, and it does run jobs in parallel, but far fewer at once than the number I asked for with -j.

I ran it this way:

cat untar_my_folders.jobfile | parallel -j 60

and the jobfile is very simple, it just has ~ 500 lines that look like this:

tar xvf myfolder.tar
tar xvf myfolder.tar
tar xvf myfolder.tar
tar xvf myfolder.tar
tar xvf myfolder.tar

I checked, and parallel does recognize all of the processors on the server:

$ parallel --number-of-cores
80

But when I use top I can see that it is only running ~20 jobs at once.

Thank you for any suggestions!

[edit] The version and OS info:

  • GNU parallel 20161222
  • Operating System: Ubuntu 18.04.6 LTS
  • Kernel: Linux 4.15.0-206-generic
rrr
  • That's looking 6+ years old... have you considered upgrading your **GNU Parallel**? Or try a quick run with a later version under `docker`. – Mark Setchell Apr 12 '23 at 16:26
  • Do you get any different behavior when using `-j 100%` and/or `--use-cores-instead-of-threads`? What does `parallel --number-of-threads` / `--number-of-cores` print? – Socowi Apr 12 '23 at 18:16
  • Update: Is it possible that this has to do with the `tar` command? `parallel -j 60` seems to be using all 60 threads when I ran it with a `grep` command instead – rrr Apr 12 '23 at 18:59
  • Are you really running 500 copies of the ***same exact*** `tar` command? – markp-fuso Apr 12 '23 at 19:07
  • @markp-fuso no LOL, I made simple filenames for the example! – rrr Apr 12 '23 at 19:13
  • Do not use top but rather `ps -aux | grep "tar xvf" | wc` – Tinmarino Apr 12 '23 at 22:13
  • You could try using `sed` with `"%d~60w/dev/fd/%d'`, based on `https://stackoverflow.com/a/75548027/1765658` . I will try to build this if you're interested! – F. Hauri - Give Up GitHub Apr 13 '23 at 18:00

1 Answer

I see this all the time, and @Tinmarino points to the issue, but does not explain why.

By default top sorts by CPU usage, so what you see are the processes that take up a lot of CPU time. tar is not one of them: it takes very little CPU time but a lot of disk I/O, and top does not show this.

iotop or iostat -dkx 1 can help. I have a bash function for this:

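# Show a few iostat columns for devices whose name matches $1 (default: sd).
IO() {
    string="${1:-sd}";
    iostat -dkx 1 | perl -ne 'BEGIN { $| = 1; $string = shift }
            # The column layout differs between iostat versions, so try the
            # widest layout (21 columns) first, then 16, then 14.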
            s/(........)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)/$1$3$9$21/
            ||
            s/(........)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)/$1$4$5$16/
            ||
            s/(........)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)/$1$6$7$14/;
            /Device/ and print and next;
            m^$string^ and print;
        ' "$string"
}

This way I can run IO to see all devices whose name contains 'sd', or IO sda to only show how busy /dev/sda is.
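The column positions in IO depend on the iostat version, which is why it tries three layouts. A rough alternative sketch is to look up the %util column from the header line instead of counting columns. This is not the function above: `util_by_header` is a hypothetical name, and the sketch assumes the header line starts with "Device" and contains a "%util" field, which may not hold for every iostat version.

```shell
# util_by_header (hypothetical helper): read iostat -dkx output on stdin,
# find the %util column from the "Device" header line, and print
# "device %util" for devices whose name matches $1 (default: sd).
util_by_header() {
    awk -v pat="${1:-sd}" '
        # Header line: remember which field is %util, print nothing.
        /^Device/ { for (i = 1; i <= NF; i++) if ($i == "%util") col = i; next }
        # Data line for a matching device: print its name and %util.
        col && $1 ~ pat { print $1, $col; fflush() }'
}
```

It would be used as `iostat -dkx 1 | util_by_header sda`. Some awk implementations buffer input from a pipe, so live output may lag behind iostat.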

My guess is that IO will show at least one disk maxing out, and that ps -aux | grep "tar xvf" will show the expected number of jobs running - most of them waiting for disk I/O.
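To check the "waiting for disk I/O" part, it can help to count the tar processes by state rather than by CPU usage. The sketch below is one way to do that; `count_states` is a hypothetical helper name, and it assumes a `ps` that accepts `-eo stat=,comm=` (procps on Linux does). In the ps state codes, R means running or runnable, D means uninterruptible sleep (usually waiting for I/O), and S means interruptible sleep.

```shell
# count_states (hypothetical helper): read "STATE COMMAND" pairs on stdin
# and count the processes named $1 (default: tar) by the first letter of
# their state code.
count_states() {
    cmd="${1:-tar}"
    awk -v cmd="$cmd" '
        $2 == cmd { n[substr($1, 1, 1)]++; total++ }
        END {
            printf "total=%d R=%d D=%d S=%d\n", total, n["R"], n["D"], n["S"]
        }'
}
```

It would be used as `ps -eo stat=,comm= | count_states tar`. If total is close to 60 while most processes are in state D, parallel is starting the requested number of jobs and the disk is the bottleneck.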

How to improve this: copy all your files to a RAM disk. In that case your CPU will be waiting less for disk I/O:

top - 10:10:45 up 7 min,  3 users,  load average: 72.80, 54.00, 27.52
Tasks: 1229 total,  82 running, 1147 sleeping,   0 stopped,   0 zombie
%Cpu(s):  4.1 us, 94.9 sy,  0.0 ni,  0.3 id,  0.0 wa,  0.0 hi,  0.8 si,  0.0 st
GiB Mem :    503.9 total,      1.6 free,     24.9 used,    477.4 buff/cache
GiB Swap:    200.0 total,    184.2 free,     15.8 used.      5.4 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND            
    493 root      20   0       0      0      0 R  98.2   0.0   0:51.74 kswapd3            
 334367 tange     20   0    8152   3192   2996 R  92.7   0.0   0:13.70 tar                
 334717 tange     20   0    8152   3148   2956 R  92.4   0.0   0:12.28 tar                
 334143 tange     20   0    8152   3176   2984 R  92.0   0.0   0:13.11 tar                
 334114 tange     20   0    8152   3172   2976 R  90.5   0.0   0:13.41 tar                
 334285 tange     20   0    8152   1276   1144 R  89.9   0.0   0:13.72 tar                
 334701 tange     20   0    8152   3188   2996 R  89.0   0.0   0:13.50 tar                
   2316 root      20   0 6976856 840240 823248 S  88.7   0.2   5:45.89 containerd         
 334632 tange     20   0    8152   3168   2976 R  87.2   0.0   0:12.52 tar                
 334803 tange     20   0    8152   1244   1112 R  86.9   0.0   0:13.99 tar                
 334368 tange     20   0    8152   3168   2976 R  86.2   0.0   0:13.25 tar                
 334419 tange     20   0    8152   3172   2976 R  86.2   0.0   0:13.73 tar                
 334499 tange     20   0    8152   3152   2956 R  84.7   0.0   0:12.15 tar                
 334433 tange     20   0    8152   3152   2956 R  84.4   0.0   0:12.75 tar                
 334483 tange     20   0    8152   3152   2960 R  84.1   0.0   0:13.23 tar                
 334082 tange     20   0    8152   3152   2956 R  83.8   0.0   0:12.80 tar                
 334653 tange     20   0    8152   3188   2996 R  83.5   0.0   0:12.87 tar                
 334728 tange     20   0    8152   3156   2960 R  83.5   0.0   0:12.80 tar
 334404 tange     20   0    8152   3164   2972 R  83.2   0.0   0:12.44 tar
 334206 tange     20   0    8152   3184   2988 R  82.6   0.0   0:13.17 tar
 334432 tange     20   0    8152   3128   2932 R  82.6   0.0   0:12.41 tar
 334100 tange     20   0    8152   3160   2968 R  82.3   0.0   0:13.77 tar
 334315 tange     20   0    8152   3160   2964 R  82.3   0.0   0:12.55 tar
 334587 tange     20   0    8152   3148   2956 R  82.3   0.0   0:12.22 tar
 334759 tange     20   0    8152   3148   2956 R  81.7   0.0   0:12.60 tar
 334078 tange     20   0    8152   1240   1112 R  81.3   0.0   0:13.37 tar
 334294 tange     20   0    8152   1244   1112 R  81.0   0.0   0:13.36 tar
 334434 tange     20   0    8152   3148   2956 R  80.7   0.0   0:13.28 tar
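
A minimal sketch of the RAM disk approach, assuming the archives fit in memory and that /dev/shm is a tmpfs mount (it is on most Linux systems, but check yours). `stage_to_ram` is a hypothetical helper name, and the paths are placeholders.

```shell
# stage_to_ram (hypothetical helper): copy every .tar file in directory $1
# to a directory on a RAM-backed filesystem ($2, default /dev/shm/untar)
# and print the name of that directory.
stage_to_ram() {
    src="$1"
    ram="${2:-/dev/shm/untar}"   # assumption: /dev/shm is tmpfs
    mkdir -p "$ram" || return 1
    cp -- "$src"/*.tar "$ram"/ || return 1
    printf '%s\n' "$ram"
}
```

After staging, the extraction could be run from that directory, for example `cd "$(stage_to_ram /path/to/tars)" && parallel -j 60 tar xf {} ::: *.tar`. The extracted files then also land on the RAM disk, so they need to be copied back to persistent storage afterwards.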
Ole Tange