
I have a bash script that runs several other scripts in series. However, it takes a decent amount of time to run them all. Is there a way to run these scripts in parallel to improve overall performance? They are independent of each other.

It looks similar to:

#!/bin/bash

#some code here
cppcheck.sh
churn.sh
run.sh

Update:

**git log --pretty=format: --numstat | perl -ane'$c{$F[2]} += abs($F[0]+$F[1]) 
if $F[2];END {print "$_\t$c{$_}\n" for sort keys %c}' > ${OUTPUT_DIR}/churn.txt**
sed -i -e '/deps/d;/build/d;/translations/d;/tests/d' -e 30q ${OUTPUT_DIR}/churn.txt
sort -r -n -t$'\t' -k2 ${OUTPUT_DIR}/churn.txt -o ${OUTPUT_DIR}/churn.txt
echo "set term canvas size 1200, 800; set output '${OUTPUT_DIR}/output.html'; 
unset key; set bmargin at screen 0.4; set xtics rotate by -90 scale 0,0; 
set ylabel 'Number of lines changed (total)'; set title 'Files with high churn 
level';set boxwidth 0.7; set style fill solid; set noborder; 
plot '${OUTPUT_DIR}/churn.txt' using 2:xticlabels(1) with boxes" | gnuplot
echo "finished running churn.sh!"

This is the code inside churn.sh. The bold command takes 40 or so seconds to run. If I put an ampersand after churn.sh in the main script (churn.sh &), it throws an error saying sed can't read the churn.txt file (since it hasn't been created yet). It seems that it doesn't wait until the output is saved to the file. I inserted wait after that command, but it doesn't help.

Bdar

1 Answer


Using & to run each script in the background will do the trick:

cppcheck.sh &
churn.sh &
run.sh &

wait
echo "All 3 complete"

It will fork off a new process for each of them.

Bash's wait will also come in handy, as stated in the comments, if you have something to run in the parent script after these three finish.

Without an argument it will wait for all child processes to complete, and then resume execution of the parent script.
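
If the parent script also needs to know whether each job succeeded, wait can be given a specific process ID, in which case it returns that job's exit status. The following is a minimal sketch under stated assumptions: the three functions are hypothetical stand-ins for cppcheck.sh, churn.sh and run.sh, and the failing exit code is invented purely for illustration.

```shell
#!/bin/bash
# Hypothetical stand-ins for cppcheck.sh, churn.sh and run.sh.
cppcheck_stub() { sleep 1; return 0; }
churn_stub()    { sleep 1; return 3; }   # pretend this one fails
run_stub()      { sleep 1; return 0; }

# Start each job in the background and remember its PID ($! is the last one started).
cppcheck_stub &
cppcheck_pid=$!
churn_stub &
churn_pid=$!
run_stub &
run_pid=$!

failures=0
for pid in "$cppcheck_pid" "$churn_pid" "$run_pid"; do
    # "wait PID" blocks until that job ends and returns its exit status.
    if ! wait "$pid"; then
        failures=$((failures + 1))
    fi
done

echo "background jobs that failed: $failures"
```

A plain wait without arguments is enough when only completion matters; waiting on each PID is worth the extra lines when a failure in one script should change what the parent does next.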


The issue you are facing seems to be related to variable scope. A variable set in the parent script is visible only to that shell unless it is exported. So, if you have OUTPUT_DIR specified in the parent script, it won't be visible to the child script when it is started. The right thing to do in this case is to export the variable as an environment variable.
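
The difference between a plain and an exported variable can be checked with a short experiment. This is only a sketch: the path /tmp/report is a made-up value, and sh -c stands in for a child script such as churn.sh.

```shell
#!/bin/bash
# Start from a clean state in case OUTPUT_DIR is already in the environment.
unset OUTPUT_DIR

# Plain shell variable: it exists only in this shell.
OUTPUT_DIR=/tmp/report   # hypothetical value, for illustration only
seen_before=$(sh -c 'echo "${OUTPUT_DIR:-unset}"')

# After export, child processes inherit the variable.
export OUTPUT_DIR
seen_after=$(sh -c 'echo "${OUTPUT_DIR:-unset}"')

echo "child sees before export: $seen_before"
echo "child sees after export:  $seen_after"
```

With an unset OUTPUT_DIR, a redirection like > ${OUTPUT_DIR}/churn.txt expands to /churn.txt, which would explain one command writing to a different place than the next one reads from.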

Anirudh Ramanathan
  • Additionally, a call to 'wait' will ensure the parent script doesn't exit until all the background jobs have completed –  Mar 26 '13 at 18:46
  • @DarkCthulhu I've already used that. The problem is the following. One line gets the data from git using --numstat and writes it to a file. That file is then used in the next line, but since & starts executing the line and skips to the next one, the following line throws an error saying 'file doesn't exist', because it takes a while to get the data and save it in a txt file. It's almost the same issue in all the script files. – Bdar Mar 26 '13 at 18:59
  • @Bdar You just said the scripts are independent of each other. If you want to add a little time between their executions, you can try `sleep n && churn.sh &`. Else, you are better off executing them in sequence. – Anirudh Ramanathan Mar 26 '13 at 19:01
  • @DarkCthulhu Yes, the scripts are independent of each other. But inside cppcheck there is a line that creates an output (which takes about 20 seconds), and the following line works with that output, but it seems that it skips to the next line without waiting until the output has been created. It's inside the script. So, does this mean & is also active inside the script as well? – Bdar Mar 26 '13 at 19:12
  • @Bdar Could you post the relevant part where you run into this issue? & shouldn't affect execution of the individual scripts themselves. – Anirudh Ramanathan Mar 26 '13 at 20:04
  • @DarkCthulhu I added additional lines in my question. Can you take a look please? Thank you for your help. – Bdar Mar 26 '13 at 20:52
  • @Bdar I don't have any such issues. I just tested locally with something similar, even with a delay. If I had to guess, I'd guess that your `${OUTPUT_DIR}` is not getting set in the child script. Are you doing `export OUTPUT_DIR=` in your parent script? Can you check with a hardcoded OUTPUT_DIR once? – Anirudh Ramanathan Mar 26 '13 at 21:13
  • @DarkCthulhu the issue is solved, thank you! – Bdar Mar 27 '13 at 17:11
  • What if one of the scripts fails? Then the other two don't need to run. It would be better to use parallel. – 52coder Nov 17 '18 at 02:52