For a simple version of this -
while read ip
do ping -n -c 2 -W 1 ${ip} >> Results/${ip}.log 2>&1 &
done < IPs
The & on the end puts it in the background and lets the loop start the next iteration while that one is processing. As an added point, I also redirected the stderr of each to the same log (2>&1) so errors wouldn't get lost if something failed.
$: ls x a # x exists, a doesn't
ls: cannot access 'a': No such file or directory
x
$: ls x a > log # send stdout to log, but error still goes to console
ls: cannot access 'a': No such file or directory
$: cat log # log only has success message
x
$: ls x a > log 2>&1 # send stderr where stdout is going - to same log
$: cat log # now both messages in the log
ls: cannot access 'a': No such file or directory
x
I also switched to a while read to avoid needing the cat in the for, but that's mostly stylistic preference.
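A minimal sketch of that simple loop wrapped up as a function, under a few assumptions of mine: probe is a hypothetical stand-in for the ping command so the example runs anywhere, run_all is just an illustrative name, IFS= read -r is added to keep each line intact, and a final wait holds things up until every background job has written its log.

```bash
#!/usr/bin/env bash
# Sketch only: probe() is a hypothetical stand-in for
#   ping -n -c 2 -W 1 "$ip"
probe() {
  echo "probing $1"
  echo "warning for $1" >&2     # something on stderr, to show it lands in the log too
}

run_all() {                      # usage: run_all <ip-list-file> <log-dir>
  local list=$1 dir=$2 ip
  mkdir -p "$dir"
  while IFS= read -r ip          # read the list line by line; no cat or for needed
  do
    [ -n "$ip" ] || continue     # skip blank lines
    probe "$ip" >> "$dir/$ip.log" 2>&1 &   # background; stdout and stderr share one log
  done < "$list"
  wait                           # block until every background job has finished
}

# Example run against a throwaway directory
tmp=$(mktemp -d)
printf '%s\n' 10.0.0.1 10.0.0.2 > "$tmp/IPs"
run_all "$tmp/IPs" "$tmp/Results"
cat "$tmp/Results/10.0.0.1.log"
```

Without the wait, the script can exit (or move on to reading the logs) while some of the background jobs are still writing.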
For a more load-aware version, use wait.
I made a simplistic control file that just has a letter per line -
$: cat x
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z
Then I declared a couple of values - the max number of jobs I want it to fire at once, and a counter.
$: declare -i cnt=0 max=10
Then I typed in a read loop to iterate over the values, and run a set at a time. Until it accumulates the stated max, it keeps adding processes in background and counting them. Once it gets enough, it waits for those to finish and resets the counter before continuing with another set.
$: while read ctl # these would be your IP's
> do if (( cnt++ < max )) # this checks for max load
> then echo starting $ctl # report which we're doing
> date # throw a timestamp
> sleep 10 & # and fire the task in background
> else echo letting that batch work... # when too many running
> cnt=0 # reset the counter
> wait # and thumb-twiddle till they all finish
> echo continuing # log
> date # and timestamp
> fi
> done < x # the whole loop reads from x until done
Here's the output.
starting a
Thu, Oct 25, 2018 8:13:34 AM
[1] 10436
starting b
Thu, Oct 25, 2018 8:13:34 AM
[2] 7544
starting c
Thu, Oct 25, 2018 8:13:34 AM
[3] 10296
starting d
Thu, Oct 25, 2018 8:13:34 AM
[4] 6244
starting e
Thu, Oct 25, 2018 8:13:34 AM
[5] 8560
starting f
Thu, Oct 25, 2018 8:13:35 AM
[6] 8824
starting g
Thu, Oct 25, 2018 8:13:35 AM
[7] 11640
starting h
Thu, Oct 25, 2018 8:13:35 AM
[8] 9856
starting i
Thu, Oct 25, 2018 8:13:35 AM
[9] 7612
starting j
Thu, Oct 25, 2018 8:13:35 AM
[10] 9100
letting that batch work...
[1] Done sleep 10
[2] Done sleep 10
[3] Done sleep 10
[4] Done sleep 10
[5] Done sleep 10
[6] Done sleep 10
[7] Done sleep 10
[8] Done sleep 10
[9]- Done sleep 10
[10]+ Done sleep 10
continuing
Thu, Oct 25, 2018 8:13:45 AM
starting l
Thu, Oct 25, 2018 8:13:45 AM
[1] 8600
starting m
Thu, Oct 25, 2018 8:13:45 AM
[2] 516
starting n
Thu, Oct 25, 2018 8:13:45 AM
[3] 3296
starting o
Thu, Oct 25, 2018 8:13:45 AM
[4] 8608
starting p
Thu, Oct 25, 2018 8:13:46 AM
[5] 4040
starting q
Thu, Oct 25, 2018 8:13:46 AM
[6] 7476
starting r
Thu, Oct 25, 2018 8:13:46 AM
[7] 4468
starting s
Thu, Oct 25, 2018 8:13:46 AM
[8] 4144
starting t
Thu, Oct 25, 2018 8:13:46 AM
[9] 8956
starting u
Thu, Oct 25, 2018 8:13:46 AM
[10] 6864
letting that batch work...
[1] Done sleep 10
[2] Done sleep 10
[3] Done sleep 10
[4] Done sleep 10
[5] Done sleep 10
[6] Done sleep 10
[7] Done sleep 10
[8] Done sleep 10
[9]- Done sleep 10
[10]+ Done sleep 10
continuing
Thu, Oct 25, 2018 8:13:56 AM
starting w
Thu, Oct 25, 2018 8:13:56 AM
[1] 5520
starting x
Thu, Oct 25, 2018 8:13:56 AM
[2] 6436
starting y
Thu, Oct 25, 2018 8:13:57 AM
[3] 12216
starting z
Thu, Oct 25, 2018 8:13:57 AM
[4] 8468
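Note that k and v never show up in that output: the iteration that finds the batch full takes the else branch and never starts its own task. And when finished, the last few are still running because I didn't go to the trouble of writing all this to an actual script with meticulous checking.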
$: ps
PID PPID PGID WINPID TTY UID STIME COMMAND
11012 10944 11012 11040 pty0 2136995 07:59:35 /usr/bin/bash
6436 11012 6436 9188 pty0 2136995 08:13:56 /usr/bin/sleep
5520 11012 5520 10064 pty0 2136995 08:13:56 /usr/bin/sleep
12216 11012 12216 12064 pty0 2136995 08:13:57 /usr/bin/sleep
8468 11012 8468 10100 pty0 2136995 08:13:57 /usr/bin/sleep
9096 11012 9096 10356 pty0 2136995 08:14:03 /usr/bin/ps
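A minimal sketch of the same batching loop as a function, with two changes of mine: the line that finds the batch full is still started once the wait returns, and a final wait holds for the last partial batch. The names task and run_batches are hypothetical, and task just sleeps briefly in place of real work.

```bash
#!/usr/bin/env bash
# Sketch only: task() is a hypothetical stand-in for the real job.
task() { sleep 0.2; echo "done $1"; }

run_batches() {                  # usage: run_batches <max> < control-file
  local -i cnt=0 max=$1
  local ctl
  while IFS= read -r ctl
  do
    if (( cnt >= max ))          # the current batch is full
    then
      wait                       # let the whole batch finish
      cnt=0                      # and start counting a new one
    fi
    task "$ctl" &                # every line gets started, full batch or not
    cnt=$(( cnt + 1 ))
  done
  wait                           # don't return while the last batch is still running
}

# Seven items, three at a time; sort because completion order is not fixed
printf '%s\n' a b c d e f g | run_batches 3 | sort
```

The load pattern is unchanged: each batch still has to drain completely before the next one starts.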
This does cause burst loads that (for tasks that don't all finish at about the same time) will dwindle till the last is done, causing spikes and lulls. With a little more finesse we could write a waitpid trap that would fire a new job each time one finished to keep the load steady, but that's an exercise for another day unless someone just really wants to see it. (I did it in Perl before, and have kind of always wanted to implement it in bash just because...)
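For what it's worth, here is a rough sketch of the steady-load idea that swaps the trap for bash's wait -n, which returns as soon as any one background job exits. This assumes bash 4.3 or newer (where wait -n was added); task and run_steady are hypothetical names, and the counter is only a conservative estimate of what is running, so the cap may be under-used but should not be exceeded.

```bash
#!/usr/bin/env bash
# Sketch only, assuming bash >= 4.3 for `wait -n`.
task() { sleep 0.2; echo "done $1"; }    # hypothetical stand-in for the real job

run_steady() {                   # usage: run_steady <max> < control-file
  local -i running=0 max=$1
  local ctl
  while IFS= read -r ctl
  do
    if (( running >= max ))      # all slots are busy
    then
      wait -n                    # returns when any single job exits
      running=$(( running - 1 )) # one slot has been freed
    fi
    task "$ctl" &                # fill the free slot
    running=$(( running + 1 ))
  done
  wait                           # drain whatever is still running
}

# Seven items, at most three in flight; sort because completion order is not fixed
printf '%s\n' a b c d e f g | run_steady 3 | sort
```

Compared with waiting for a whole batch, a slow job here only ties up its own slot while the others keep cycling.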
Because it was requested -
Obviously, as presented in other posts, you could just use parallel... but as an exercise, here's one way you could set up a number of process chains that would read from a queue. I opted for a simple callback rather than dealing with a SIGCHLD trap because there are a lot of little subprocs flying around...
Refinements welcome if anyone cares.
#!/usr/bin/env bash
trap 'echo abort $0@$LINENO; die; exit 1' ERR # make sure any error is fatal
declare -i primer=0 # a countdown of how many processes to pre-spawn
use="
$0 <#procs> <cmdfile>
Pass the number of desired processes to prespawn as the 1st argument.
Pass the command file with the list of tasks you need done.
Command file format:
KEYSTRING:cmdlist
where KEYSTRING will be used as a unique logfile name
and cmdlist is the base command string to be run
"
die() {
echo "$use" >&2
return 1
}
case $# in
2) primer=$1
case "$primer" in
*[^0-9]*) echo "INVALID #procs '$primer'"
die;;
esac
cmdfile=$2
[[ -r "$cmdfile" ]] || die
declare -i lines=$( grep -c . "$cmdfile" )
if (( lines < primer ))
then echo "Note - command lines in $cmdfile ($lines) fewer than requested process chains ($primer)"
die
fi ;;
*) die ;;
esac >&2
trap ': no-op to ignore' HUP # ignore hangups (built-in nohup without explicit i/o redirection)
spawn() {
IFS="$IFS:" read key cmd || return
echo "$(date) executing '$cmd'; c.f. $key.log" | tee $key.log
echo "# autogenerated by $0 $(date)
{ $cmd
spawn
} >> $key.log 2>&1 &
" >| $key.sh
. $key.sh
rm -f $key.sh
return 0
}
while (( primer-- )) # until we've filled the requested quota
do spawn # create a child process
done < "$cmdfile"
Yes, there are security concerns with reading possibly dirty data and sourcing it. I wanted to keep the framework simple as an exercise. Suggestions are still welcome.
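One small hardening step, sketched here under my own assumptions: split each KEYSTRING:cmdlist line at the first colon only, and refuse keys that would be unsafe as a log file name. parse_line is a hypothetical helper, the allowed character set is an arbitrary choice, and unlike the read in spawn it does not also split on whitespace. It only guards the file name; the command itself is still run as-is, so the command file has to be trusted either way.

```bash
#!/usr/bin/env bash
# Sketch only: parse_line is a hypothetical helper, not part of the script above.
parse_line() {                   # usage: parse_line 'KEYSTRING:cmdlist'
  local line=$1 key cmd
  case $line in
    *:*) ;;                      # must contain at least one colon
    *)   return 1 ;;
  esac
  key=${line%%:*}                # everything before the first colon
  cmd=${line#*:}                 # everything after it, later colons included
  case $key in
    ''|*[!A-Za-z0-9_-]*) return 1 ;;   # empty, or a character outside the safe set
  esac
  printf '%s\n%s\n' "$key.log" "$cmd"  # log file name, then the command string
}

parse_line 'd:date > /tmp/x; cat /tmp/x'
parse_line '../../etc/passwd:echo oops' || echo "rejected"
```

The second call is rejected because the key contains dots and slashes, which would otherwise let a line in the command file write its log outside the working directory.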
I threw together a quick command file with some complex commands built of simple crap just as examples.
a:for x in $( seq 1 10 );do echo "on $x";date;sleep 1;done &
b:true && echo ok || echo no
c:false && echo ok || echo no
d:date > /tmp/x; cat /tmp/x
e:date;sleep 5;date
f:date;sleep 13;date
g:date;sleep 1;date
h:date;sleep 5;date
i:date;sleep 17;date
j:date;sleep 1;date
k:date;sleep 9;date
l:date;sleep 19;date
m:date;sleep 7;date
n:date;sleep 19;date
o:date;sleep 11;date
p:date;sleep 17;date
q:date;sleep 6;date
r:date;sleep 7;date
s:date;sleep 18;date
t:date;sleep 6;date
u:date;sleep 9;date
v:date;sleep 9;date
w:date;sleep 2;date
x:date;sleep 0;date
y:date;sleep 3;date
z:date;sleep 10;date
Note the first one even runs itself in background - the spooler doesn't care. Job a will start b before the spooler can, so the spooler will skip to c.
Some of the logs -
a - original spawn; ran itself in background and immediately started b, then kept logging
Thu, Oct 25, 2018 2:33:57 PM executing 'for x in $( seq 1 10 );do echo "on $x";date;sleep 1;done &'; c.f. a.log
on 1
Thu, Oct 25, 2018 2:33:58 PM executing 'true && echo ok || echo no'; c.f. b.log
Thu, Oct 25, 2018 2:33:58 PM
on 2
Thu, Oct 25, 2018 2:33:59 PM
on 3
Thu, Oct 25, 2018 2:34:00 PM
on 4
Thu, Oct 25, 2018 2:34:01 PM
on 5
Thu, Oct 25, 2018 2:34:02 PM
on 6
Thu, Oct 25, 2018 2:34:04 PM
on 7
Thu, Oct 25, 2018 2:34:05 PM
on 8
Thu, Oct 25, 2018 2:34:06 PM
on 9
Thu, Oct 25, 2018 2:34:07 PM
on 10
Thu, Oct 25, 2018 2:34:08 PM
b - exited quickly and started f because c, d, & e had already been run
Thu, Oct 25, 2018 2:33:58 PM executing 'true && echo ok || echo no'; c.f. b.log
ok
Thu, Oct 25, 2018 2:33:58 PM executing 'date;sleep 13;date'; c.f. f.log
c - original spawn; finished before b, so it started d, which is why b started f
Thu, Oct 25, 2018 2:33:58 PM executing 'false && echo ok || echo no'; c.f. c.log
no
Thu, Oct 25, 2018 2:33:58 PM executing 'date > /tmp/x; cat /tmp/x'; c.f. d.log
d - started by c, finished and started h because g had already been run
Thu, Oct 25, 2018 2:33:58 PM executing 'date > /tmp/x; cat /tmp/x'; c.f. d.log
Thu, Oct 25, 2018 2:33:58 PM
Thu, Oct 25, 2018 2:33:59 PM executing 'date;sleep 5;date'; c.f. h.log
e - original spawn, started n because everything up to that had been run
Thu, Oct 25, 2018 2:33:58 PM executing 'date;sleep 5;date'; c.f. e.log
Thu, Oct 25, 2018 2:33:58 PM
Thu, Oct 25, 2018 2:34:04 PM
Thu, Oct 25, 2018 2:34:04 PM executing 'date;sleep 19;date'; c.f. n.log
(skipping ahead a bit...)
n - started by e, took long enough to finish there were no more tasks to start
Thu, Oct 25, 2018 2:34:04 PM executing 'date;sleep 19;date'; c.f. n.log
Thu, Oct 25, 2018 2:34:04 PM
Thu, Oct 25, 2018 2:34:23 PM
It works. It isn't perfect, but it could be handy. :)