
I am using the `jobs` command to control the number of compute-intensive processes. I want to run no more than `max_cnt` processes at a time and exit only after all of them have finished.

I use the bash script below to accomplish this. However, it always lists one process as still running, even after everything has executed and stopped.

Moreover, I can't find that process in htop's process list. Where should I look for the process reported by `echo $(jobs -p)`, and how do I fix the script so that it exits once everything has stopped?

#!/usr/bin/env bash

SLEEP=5
max_cnt=8

# generate a random number less than or equal to $1
function random {
    rand=$RANDOM
    while [ "$rand" -gt $1 ]
    do
        rand=$RANDOM
    done
}

function job {
    # resource intensive job simulated by random sleeping
    echo param1="$1",param2="$2"
    random 20
    echo Sleeping for $rand
    sleep $rand
}

for param1 in 1e-6 1e-5 1e-4 1e-3 1e-2
do
    for param2 in "ones" "random"
    do
        echo starting job with $param1 $param2
        job $param1 $param2 &
        while [ "$(jobs -p|wc -l)" -ge "$max_cnt" ]
        do
            echo "current running jobs.. $(jobs -p|wc -l) ... sleeping"
            sleep $SLEEP
        done
    done
done

while [ "$(jobs -p|wc -l)" -ge 1 ]
do
    echo "current running jobs.. $(jobs -p|wc -l) ... sleeping"
    sleep $SLEEP
    echo $(jobs -p)
done
Umang Gupta
  • `echo $(jobs -p)` is a useless use of `echo`. Just `jobs -p`. I think in scripts you have to enable job control with `set -m` or `set +m`. Using `xargs` like `xargs -P"$max_cnt" bash -c 'job "$@"'` is an easy way to parallelize work – KamilCuk Feb 20 '20 at 00:17
  • In general, in a script, you can implement your own job control by storing `$!` for each job you've run (for example, as keys in an associative array). `jobs` was designed for interactive use, and for that matter is specified only in the POSIX user portability extensions annex; it's not suitable for scripts, or even guaranteed to be available at all in a shell compiled with interactive features turned off. – Charles Duffy Feb 20 '20 at 00:20
  • Also, about the `function` keyword -- see https://wiki.bash-hackers.org/scripting/obsolete, particularly the entry about it in the last table in the page (describing nonportable syntax that should be used only with a specific reason to do so). – Charles Duffy Feb 20 '20 at 00:22
  • See https://stackoverflow.com/a/356154/14122 for an example of tracking background processes without using `jobs`. The `wait` command returns the exit status of the job it's waiting for, so one can track which individual jobs succeeded and failed; if one added `unset "pids[$i]"` after the `wait`, one would also have a count of the number of jobs started but not yet reaped available to check. – Charles Duffy Feb 20 '20 at 00:26
  • @CharlesDuffy Thanks for the comments. Any clue on why that 1 process exists? – Umang Gupta Feb 20 '20 at 01:14
  • @KamilCuk I realized and fixed it later (in code not posted here). I am a beginner; can you please describe the second part of your comment in more detail? – Umang Gupta Feb 20 '20 at 01:17
  • Consider using **GNU Parallel** to run jobs in parallel. If you want 8 jobs running at a time, `parallel -j8 < ListOfJobs.txt` – Mark Setchell Feb 20 '20 at 01:18
  • @MarkSetchell Thanks for the tip. But I don't think I will be able to use `for` loops and variables like I am doing above. Also, using bash gives more control, for example monitoring resources and deciding whether to execute or not. – Umang Gupta Feb 20 '20 at 01:22
  • You don't need to. **GNU Parallel** will do it for you. `parallel echo ::: a b c ::: 1 2 3` is the equivalent of nested `for` loops over a,b,c and 1,2,3 – Mark Setchell Feb 20 '20 at 01:25
  • **GNU Parallel** can decide scheduling based on CPU load or memory pressure for you, and distribute load across multiple servers, and handle fail/restarts, and tag output, and show progress meters... just saying. – Mark Setchell Feb 20 '20 at 01:28
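
For reference, here is a minimal sketch of the `$!`/`wait` tracking approach suggested in the comments, which avoids `jobs` entirely. The `job` stub, the `pids` array name, and the one-second polling interval are illustrative choices, not taken from the original post:

#!/usr/bin/env bash
# requires bash 4+ for associative arrays

max_cnt=8

job() {
    # illustrative stand-in for the real compute-intensive job
    echo "param1=$1,param2=$2"
    sleep $(( (RANDOM % 5) + 1 ))
}

declare -A pids=()   # PID of each running job -> its parameters

for param1 in 1e-6 1e-5 1e-4 1e-3 1e-2; do
    for param2 in "ones" "random"; do
        job "$param1" "$param2" &
        pids[$!]="$param1 $param2"
        # throttle: once max_cnt jobs are in flight, reap finished ones first
        while (( ${#pids[@]} >= max_cnt )); do
            for pid in "${!pids[@]}"; do
                if ! kill -0 "$pid" 2>/dev/null; then   # process has exited
                    wait "$pid"                          # collect its exit status
                    echo "job ${pids[$pid]} exited with status $?"
                    unset "pids[$pid]"
                fi
            done
            sleep 1
        done
    done
done

# wait for the remaining jobs and report their status
for pid in "${!pids[@]}"; do
    wait "$pid"
    echo "job ${pids[$pid]} exited with status $?"
done

Because every job's PID is stored explicitly and waited for, the script ends exactly when the last background process has finished.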

1 Answer


As mentioned in the comments, you may want to consider using GNU Parallel; it makes managing parallel jobs much easier. Your code could look like this:

#!/usr/bin/env bash

function job {
    # resource intensive job simulated by random sleeping
    echo param1="$1",param2="$2"
    ((s=(RANDOM%5)+1))
    echo Sleeping for $s
    sleep $s
}
# export function to subshells
export -f job

# run "job" for every combination of the two parameter lists, up to 8 jobs at a time
parallel -j8 job {1} {2} ::: 1e-6 1e-5 1e-4 1e-3 1e-2 ::: "ones" "random"

Sample Output

param1=1e-6,param2=ones
Sleeping for 1
param1=1e-5,param2=ones
Sleeping for 1
param1=1e-2,param2=ones
Sleeping for 1
param1=1e-4,param2=ones
Sleeping for 2
param1=1e-4,param2=random
Sleeping for 2
param1=1e-6,param2=random
Sleeping for 4
param1=1e-2,param2=random
Sleeping for 3
param1=1e-3,param2=random
Sleeping for 4
param1=1e-5,param2=random
Sleeping for 5
param1=1e-3,param2=ones
Sleeping for 5

There are many other switches and parameters:

  • parallel --dry-run ... will show you what it would do, without actually doing anything
  • parallel --eta ... which gives you an "Estimated Time of Arrival"
  • parallel --bar ... which gives you a progress bar
  • parallel -k ... which keeps output in order
  • parallel -j 8 ... which runs 8 jobs at a time rather than the default of 1 job per CPU core
  • parallel --pipepart ... which will split the contents of a massive file across subprocesses

Note also that GNU Parallel can distribute work across other machines in your network, and it has fail and retry handling, output tagging and so on...
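
For example, here are a couple of invocations that combine those switches with the command above (a sketch using the same job function and parameter lists):

# preview the generated commands without running anything
parallel --dry-run -j8 job {1} {2} ::: 1e-6 1e-5 1e-4 1e-3 1e-2 ::: "ones" "random"

# run for real, 8 at a time, with a progress bar and output kept in input order
parallel --bar -k -j8 job {1} {2} ::: 1e-6 1e-5 1e-4 1e-3 1e-2 ::: "ones" "random"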

Mark Setchell
  • Thanks! `parallel` is interesting and almost solves my problem. However, I am still curious about the job not being listed, so I will wait before accepting this solution. – Umang Gupta Feb 20 '20 at 09:59
  • Which job is not being listed? You may want to use `parallel -k ...` to keep the output in order so you can check it. – Mark Setchell Feb 20 '20 at 10:07
  • See "Moreover, I can't find that process listed in htop's list of processes..... " this issue. `parallel` is fine, I want to understand the problem with my understanding of `jobs -p` – Umang Gupta Feb 20 '20 at 10:10