
I am trying to create a script to run on a hardware firewall. The firewall runs a hardened Linux that lacks some tools, for example `timeout` and `parallel`.

A separate pipeline collects IP addresses from the netstat command output, based on some criteria, and feeds these IPs into the script.

The script itself will execute a command per IP address in parallel, and after 20 seconds terminate any commands that have not yet finished.

The output of all the commands needs to go to the terminal on standard output.

The closest I have come is this script, which I found and modified somewhat to fit my needs:

#!/bin/bash

for cmd in "$@"; do {
    echo "Process \"$cmd\" started";
    $cmd & pid=$!
    PID_LIST+=" $pid";
} done

echo "Parallel processes have started";
sleep 5
for id in $PID_LIST
do
    kill $id
done


echo
echo "All processes have completed";

This script does what I need, but it requires the input data to be passed as arguments, like:

./script.sh "command IP" "command IP"

My first idea was to use "read" to pipe the IP addresses into the command, so I replaced the for loop in it with a

while read cmd; do... (and the rest of the script above on one line)

The problem that occurred was that the PID_LIST variable did not survive outside of the while loop, like it does in the for-loop example above. This means the script could not track the PIDs of the commands it started and kill them after the timeout.

So my questions are:

  1. Is there a reason a variable declared in a for loop is still set outside of the loop, and not in a while loop?
  2. Can I pipe data into this script while still keeping the for loop? For example in a similar way to what I did with the while loop, using "while read cmd; do", or in some other way?

I have thought about trying to get the data I get out from my netstat command into an array, and replacing "$@" with the array, but I have not succeeded.
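For what it's worth, that array idea can work. Here is a minimal sketch (the `myoutput` function is a hypothetical stand-in for the real netstat pipeline, and the `echo` commands are placeholder per-IP commands): reading the lines into a bash array with a loop fed by process substitution keeps the loop in the main shell, so the array survives it and can replace `"$@"`:

```shell
#!/bin/bash

# Hypothetical stand-in for the real netstat pipeline: one "command IP" per line.
myoutput() {
    printf '%s\n' 'echo 192.0.2.1' 'echo 192.0.2.2' 'echo 192.0.2.3'
}

# Read every line into an array; process substitution keeps the loop
# in the main shell, so CMDS is still set after the loop ends.
CMDS=()
while IFS= read -r line; do
    CMDS+=("$line")
done < <(myoutput)

# The original for loop can now iterate over "${CMDS[@]}" instead of "$@".
for cmd in "${CMDS[@]}"; do
    echo "would start: $cmd"
done
```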

I have read *Command line command to auto-kill a command after a certain amount of time*, but I do not feel it answers enough of my questions to reach my target. Most of the scripts there target running one command only, not several in parallel.
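Since `timeout` is missing on the firewall, one pure-bash substitute is to background the command, poll it, and kill it once the limit is reached. This is only a rough sketch (the `run_with_timeout` name is made up for illustration, not from any of the linked answers):

```shell
#!/bin/bash

# Sketch of a poor man's `timeout` in plain bash: run the command in
# the background, wait up to $1 seconds, then kill it if still running.
run_with_timeout() {
    local secs=$1; shift
    "$@" &                          # start the command in the background
    local pid=$!
    local waited=0
    while [ "$waited" -lt "$secs" ] && kill -0 "$pid" 2>/dev/null; do
        sleep 1
        waited=$((waited + 1))
    done
    kill "$pid" 2>/dev/null         # terminate it if it is still alive
    wait "$pid" 2>/dev/null         # reap; returns the command's status
}

# Example: give a slow command 2 seconds before terminating it.
run_with_timeout 2 sleep 30 || true
```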

EDIT:

In order to give some more info, I experimented further and noticed something that might be the cause of my problem. The final script needs to run on one line, together with the part collecting the info from netstat. But when testing it, I put the script in a file and piped into it:

myoutput | ./parallel.sh

This works well, just like you said. Even when parallel.sh has all the code on one line.

For reference, this is what is inside of parallel.sh

while read cmd; do { echo "Process \"$cmd\" started"; $cmd & pid=$!  PID_LIST+=" $pid"; } done ; echo "Parallel processes have started"; sleep 5; for id in $PID_LIST ; do  kill $id ; done

But when I take that line out of parallel.sh and put it directly on the command line, it doesn't work:

myoutput | while read cmd; do { echo "Process \"$cmd\" started"; $cmd & pid=$!  PID_LIST+=" $pid"; } done ; echo "Parallel processes have started"; sleep 5; for id in $PID_LIST ; do  kill $id ; done

So what is different between piping into the parallel.sh or having the content of the script on the same command line?

It seems that $PID_LIST is empty outside of the while loop when running everything on one line, while it survives outside of the while loop when piping to ./parallel.sh.
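A small demonstration of the difference, using a dummy loop and placeholder a/b/c input in place of the real script: when the loop sits inside a file, the pipe feeds the whole script and the loop runs in the script's main shell, so the variable survives it; when the loop is pasted directly after a pipe, the loop itself becomes a pipeline segment, i.e. a subshell, and the variable dies with it:

```shell
#!/bin/bash

# Case 1: the loop lives inside a script that reads its own stdin.
# The loop runs in the script's main shell, so LIST survives the loop.
script=$(mktemp)
cat > "$script" <<'EOF'
LIST=""
while read -r x; do LIST+=" $x"; done
echo "LIST is:$LIST"
EOF
out1=$(printf '%s\n' a b c | bash "$script")
echo "$out1"
rm -f "$script"

# Case 2: the same loop pasted after a pipe on the command line.
# Each pipeline segment is a subshell, so LIST dies with the loop.
LIST=""
printf '%s\n' a b c | while read -r x; do LIST+=" $x"; done
echo "LIST is:$LIST"
```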

Johnathan
  • You should show _exactly_ how you wrote your `while`/`read` loop. There are no reasons (besides you making some coding errors, typically running the loop in a subshell) for the `while` loop to not behave as you want. – gniourf_gniourf Dec 25 '16 at 18:58
  • Don't use Bash for this. Are you mad? – Ben Dec 25 '16 at 19:21
  • I don't have much option here. Perl is not installed. Except for a few small issues I think bash would work pretty well. – Johnathan Dec 25 '16 at 19:23
  • @Ben I'd argue that shell script is just right for this. – Mort Dec 25 '16 at 20:25
  • This looks like it's missing a semicolon, or put on separate lines really. Should be `$cmd &; pid=$!` – Mort Dec 25 '16 at 20:33
  • no if you add a semicolon there it will fail – Johnathan Dec 25 '16 at 22:10
  • Like I suspected in my comment above, you're running your while loop in a subshell. See this page: [BashFAQ/024](http://mywiki.wooledge.org/BashFAQ/024). To fix it: use `while read ...; done < <(myoutput); for id in $PID_LIST; do ...; done`. By the way, you should use an array instead of a space separated list for `PID_LIST`. – gniourf_gniourf Dec 26 '16 at 10:46

2 Answers


So it seems the solution is to use `( )`.

As I said above, this works:

myoutput | ./parallel.sh

But not this:

myoutput | content of script

What was missing is grouping the script's content in parentheses:

myoutput | (content of script)
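As a quick sanity check (with placeholder a/b/c input instead of the real myoutput): the parentheses make the pipe feed one single subshell containing both the while loop and the code that later uses PID_LIST, so the variable is still set when it is needed:

```shell
# The whole ( ... ) group is one subshell on the right side of the pipe,
# so PID_LIST set by the loop is still visible at the final echo.
out=$(printf '%s\n' a b c | (
    PID_LIST=""
    while read -r cmd; do
        PID_LIST+=" $cmd"
    done
    echo "PID_LIST is:$PID_LIST"
))
echo "$out"
```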
Johnathan
  • It would be helpful to others to explain the difference between these 3 examples and why the last one is better for this situation. – Joshua Briefman Dec 26 '16 at 09:03

As suspected, you're running the while loop in a subshell, so the variables you create there are lost as soon as the subshell exits. See BashFAQ/024.

A cleaner way:

#!/bin/bash

# enable job control
set -m

while read cmd; do
    printf 'Process "%s" started\n' "$cmd"
    $cmd &
done < <(myoutput)

echo "Parallel processes have started"
sleep 5

while read pid; do
    kill $pid
done < <(jobs -rp)

wait

The changes with respect to your script:

  • The while loop is not executed in a subshell; we use process substitution instead.
  • Probably the most important change is that we're not saving the PIDs of the processes launched.
  • We retrieve the PIDs of the child processes using the jobs builtin. We kill them one by one. This is superior to your design since some jobs may have already terminated and you would kill already stopped jobs or, worse, other jobs that may have started and reused the same PID in the meantime!
  • At the end we wait for all jobs to terminate.
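To illustrate the `jobs -rp` point with dummy commands (the sleeps stand in for the real per-IP commands): the `-r` flag restricts the listing to jobs that are still running, so jobs that already finished are never killed:

```shell
#!/bin/bash
set -m                       # enable job control, as in the script above

sleep 30 &                   # two long jobs that will still be running
sleep 30 &
sleep 0.2 &                  # one short job that finishes quickly
sleep 1                      # give the short job time to exit

RUNNING=$(jobs -rp | wc -l)  # jobs -rp lists only PIDs of running jobs
echo "still running: $RUNNING"

{
    while read -r pid; do
        kill "$pid"          # kill only the jobs that are still alive
    done < <(jobs -rp)
    wait                     # reap everything before exiting
} 2> /dev/null
```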

You will see some spam like

./myparallel: line 42: 12345 Terminated      $cmd

on stderr. If you want to get rid of these (and also of possible other error messages when you kill the processes at the end), you can wrap the final loop and the wait command like so:

{
    while read pid; do
        kill $pid
    done < <(jobs -rp)

    wait
} 2> /dev/null
gniourf_gniourf