I would like to run several instances of a program in parallel on a varying number of input files. The program itself is not parallelised, which is why I am looking for a way to launch multiple instances. I'm aware of GNU parallel, but the bash script I'm writing will be shared with my co-workers, and not all of them have it installed.

I found an answer that almost matches my needs here; however, the number of processes there is hardcoded, so I can't use a here document. In my case the number of input files varies, so I thought I could list them and feed the list to xargs to execute them. I tried various ways, but none of them worked. Two of my attempts to modify the code from the link:

#!/bin/bash
nprocs=3
# Attempt one: use a loop
commands=$( for ((i=0; i<5; i++)); do echo "sleep $i; echo $i;"; done )
echo Commands:
echo $commands
echo
{
    echo $commands | xargs -n 1 -P $nprocs -I {} sh -c 'eval "$1"' - {}
} &
echo "Waiting for commands to finish..."
wait $!

# Attempt two: use awk, the rest as above
commands=$( awk 'BEGIN{for (i=1; i<5; i++) { printf("sleep %d && echo \"ps %d\";\n", i, i) }}' )

The commands are executed one after the other. What could be wrong? Thanks.

Suzanka

2 Answers

Try running just

xargs -n 1

to see what commands are being run.
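
One likely cause, offered as a sketch rather than a definitive diagnosis: the unquoted echo $commands in the question goes through word splitting, so the newlines between the commands become spaces and xargs receives a single line. With -I, xargs treats each input line as one item, so everything ends up in one sh invocation and runs sequentially. The snippet below only counts lines, it does not run the commands; the variable names unquoted_lines and quoted_lines are made up for this illustration.

```shell
#!/bin/bash
# Build a multi-line command list equivalent to the one in the question.
commands=$(for i in 0 1 2 3 4; do echo "sleep $i; echo $i;"; done)

# Unquoted expansion: word splitting replaces the newlines with spaces,
# so echo prints everything on a single line.
unquoted_lines=$(echo $commands | wc -l)

# Quoted expansion: the newlines survive, one line per command.
quoted_lines=$(echo "$commands" | wc -l)

echo "unquoted: $unquoted_lines line(s), quoted: $quoted_lines line(s)"
```

For this input the unquoted form yields one line and the quoted form yields five, which is why quoting the variable (or using an array) matters before piping into xargs.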

To avoid problems with quoting, I'd use an array of commands.

#! /bin/bash
nprocs=3

commands=()
for i in {0..4} ; do
    commands+=("sleep 1; echo $i")
done

echo Commands:
echo "${commands[@]}"

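# -I % already takes one input line per command, so -n 1 is not needed
# (GNU xargs treats -n and -I as mutually exclusive).
printf '%s\n' "${commands[@]}" \
| xargs -P "$nprocs" -I % bash -c % &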

echo "Waiting for commands to finish..."
wait $!
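
The same idea can be applied directly to a varying list of input files, which is what the question is ultimately after. The following is a minimal sketch under some assumptions: run_all is a hypothetical helper name, the echo inside sh -c is a placeholder for the real program, and xargs is assumed to support -0 and -P (GNU and BSD xargs do, but neither option is required by POSIX).

```shell
#!/bin/bash
# Maximum number of instances to run at the same time.
nprocs=3

# run_all FILE...: run one worker per file, at most $nprocs in parallel.
run_all() {
    # Nothing to do without input files (avoids feeding xargs an empty item).
    [ "$#" -gt 0 ] || return 0

    # NUL-delimited names keep file names with spaces intact.
    # Replace the echo with the real, non-parallel program; "$1" is the file.
    printf '%s\0' "$@" |
        xargs -0 -n 1 -P "$nprocs" sh -c 'echo "processed $1"' _
}

run_all "$@"
```

Saved under a name of your choice, the script would be called with the input files as arguments, for example with a glob such as data/*.txt. Passing the file name as a positional parameter instead of substituting it into the command string avoids the quoting problems mentioned above.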
choroba

parallel --embed (version >20180122) is made for your situation:

parallel --embed > newscript.sh

Now edit the last lines of newscript.sh and you have GNU Parallel included in your script that you can distribute.

Ole Tange