7

What is the correct way to multithread independent if statements in a bash script? Is it best to place the & after code contained in the if or after the expression?

For an & after the expression, it makes sense to continue threading as necessary if the if contains a large block of code. But should one line of code also end with &?

After the expression:

if [ expression ] &
  then
    #task
fi

After the task:

if [ expression ]
  then
    #task &
fi


Imagine 3 if statements that all perform tasks independent of each other. How does the execution work with the different placements of the &? From what I understand, if placed after the expression, all 3 expressions start (basically) simultaneously and so do the 3 tasks.

#Thread 1         #Thread 2         #Thread 3
if [ expr1 ] &    if [ expr2 ] &    if [ expr3 ] &
  then              then              then
    task1            task2            task3
fi                fi                fi
wait

If placed after the task code, the first if would be evaluated and only as the first task begins would the 2nd if be evaluated. The tasks are more staggered than simultaneous.

#Thread 1         #Thread 2         #Thread 3
if [ expr1 ]
  then
    task1 &       if [ expr2 ]
fi                  then
                      task2 &       if [ expr3 ]
                  fi                  then
                                        task3 &
                                    fi
wait

The expressions cannot be combined to do threading inside the if such as:

if [ combined expression ]
  then
    #task1 &
    #task2 &
    #task3 &
fi
Matt

4 Answers

6

If you want each if condition to execute within the context of its respective "thread" (actually a subshell process), then I think the thing to do is put the & after the closing fi of each if statement. Then the evaluation of each if expression, along with its conditional code, will occur entirely within the context of its own "thread".

For example:

#!/bin/bash

if [ 1 ] 
  then
    for i1 in {1..3}; do echo $i1; sleep 1; done
fi &
if [ 1 ]
  then
    for i2 in {a..c}; do echo $i2; sleep 1; done
fi &
wait

Output from each "thread" is interleaved as expected:

1
a
2
b
3
c

Note in all cases with &, these are actually processes (created with fork()) and not threads (created with pthread_create()). See Multithreading in Bash. You can test this by creating a variable, e.g. n=0 before the "threads" are started. Then in one thread increment n and echo $n in all threads. You'll see each "thread" gets its own copy of n - n will have different values in the incrementing and non-incrementing threads. fork() creates a new process copy (including independent copies of variables); pthread_create() doesn't.
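The test described above can be sketched as follows. This is only an illustration: the temporary file, the `result` variable, and the short sleep are assumptions added here so that the output from both children can be collected in one place.

```shell
#!/bin/bash
# Sketch of the "separate copies of n" test; names are illustrative only.
n=0
out=$(mktemp)   # scratch file so both children can report back

# "Thread" 1 increments its own copy of n.
if [ 1 ]; then
    n=$((n + 1))
    echo "incrementing child: n=$n" >> "$out"
fi &

# "Thread" 2 only reads n; it never sees the increment made by thread 1.
if [ 1 ]; then
    sleep 0.2
    echo "reading child: n=$n" >> "$out"
fi &

wait
# The parent's copy is untouched as well.
echo "parent: n=$n" >> "$out"
result=$(cat "$out")
rm -f "$out"
echo "$result"
```

The incrementing child should report n=1 while the other child and the parent still report n=0, which is what you would expect from fork()ed processes rather than threads sharing memory.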

Digital Trauma
  • In my specific case, I think it is fine that they are separate processes. Does `wait` just end child processes when they finish then? – Matt Oct 18 '13 at 20:13
  • Pretty much, yes. `wait` blocks until all child sub-processes have completed. So if you have any more statements after the `wait`, then these statements won't execute until all the sub-processes have completed. `wait` can also take a pid or job id if you want to wait for a specific subprocess to end. You can get the pid of subprocesses from the built-in `$!` variable right after backgrounding with `&`. – Digital Trauma Oct 18 '13 at 20:43
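A minimal sketch of waiting on one specific child via `$!`, as described in the comment above. The variable names and the exit status 3 are made up for illustration.

```shell
#!/bin/bash
# Sketch: wait for one particular child, then for all the rest.
sleep 1 &
slow_pid=$!              # PID of the most recently backgrounded command

( sleep 0.1; exit 3 ) &  # hypothetical child that fails with status 3
fast_pid=$!

wait "$fast_pid"         # blocks only until this one child exits
fast_status=$?           # wait returns that child's exit status
echo "fast child exited with status $fast_status"

wait                     # now block until every remaining child is done
echo "all children finished (slow child was PID $slow_pid)"
```

Note that `wait` with a PID also hands back that child's exit status, so a backgrounded task's success or failure can still be checked afterwards.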
5

Placing the & after the expression backgrounds only the evaluation of the conditional expression; [ is another name for the test command, and that is what gets backgrounded, not the task.

Placing the & after the task will background the task. The degree to which the tasks are staggered rather than simultaneous depends on the relative time needed to evaluate expr1 compared to the time needed to perform task1.
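To make the staggering visible, here is a rough sketch in which the condition itself is slow. `slow_check` and `task` are hypothetical stand-ins that each sleep for one second, and the timings rely on bash's built-in `SECONDS` counter, so they are only accurate to about a second.

```shell
#!/bin/bash
# Sketch: how a slow condition staggers tasks when only the task is backgrounded.
slow_check() { sleep 1; }   # hypothetical condition that takes about 1 second
task()       { sleep 1; }   # hypothetical task that takes about 1 second

# Variant A: & after the task. The three conditions still run one after another.
SECONDS=0
if slow_check; then task & fi
if slow_check; then task & fi
if slow_check; then task & fi
wait
after_task=$SECONDS

# Variant B: & after fi. Each condition and its task run in their own process.
SECONDS=0
if slow_check; then task; fi &
if slow_check; then task; fi &
if slow_check; then task; fi &
wait
after_fi=$SECONDS

echo "& after task: about ${after_task}s, & after fi: about ${after_fi}s"
```

With a cheap condition such as [ 1 ] the two variants take nearly the same time; the gap only appears once evaluating the condition is expensive.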

See the following script:

#!/bin/bash

if [ 1 ] &
then
    sleep 10 
fi
if [ 1 ] &
then
    sleep 10 
fi
if [ 1 ] &
then
    sleep 10 
fi
wait

It takes 30 seconds to run:

$ time . test.sh 
[3]   Done                    [ 1 ]
[3]   Done                    [ 1 ]
[3]   Done                    [ 1 ]

real    0m30.015s
user    0m0.003s
sys     0m0.012s

and you can see that the backgrounded job is [ 1 ], not sleep. Note that this is meaningless, as explained by @chepner.

Now the other case:

#!/bin/bash

if [ 1 ] 
then
    sleep 10 &
fi
if [ 1 ] 
then
    sleep 10 &
fi
if [ 1 ] 
then
    sleep 10 &
fi
wait

gives:

[2]   Done                    sleep 10
[3]   Done                    sleep 10
[4]-  Done                    sleep 10

real    0m10.197s
user    0m0.003s
sys     0m0.014s

It only takes 10 seconds; all the sleeps run simultaneously because each one is backgrounded.

Last option as described by @DigitalTrauma:

#!/bin/bash

if [ 1 ] 
then
    sleep 10 
fi &
if [ 1 ]
then
    sleep 10 
fi &
if [ 1 ] 
then
    sleep 10
fi &
wait

Then the whole if statement is backgrounded, i.e. both the evaluation of exprN and the execution of taskN:

[3]   Done                    if [ 1 ]; then
    sleep 10;
fi
[4]   Done                    if [ 1 ]; then
    sleep 10;
fi
[5]   Done                    if [ 1 ]; then
    sleep 10;
fi

real    0m10.017s
user    0m0.003s
sys     0m0.013s

giving the expected running time of 10 seconds.

damienfrancois
4

Compare

if false
then
  echo "condition true"
fi

with

if false &
then
  echo "condition ... true?"
fi

A command terminated by & exits immediately with status 0, so the if branch is taken regardless of what the background process eventually returns. It doesn't make sense to put a command that unconditionally succeeds in the command list of an if statement. You have to let it complete so the body can be correctly (conditionally) executed.
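A small sketch that makes this observable. The variable names are illustrative; the second half uses `wait $!` to recover the status that the backgrounded command really returned.

```shell
#!/bin/bash
# Sketch: a backgrounded condition always "succeeds" as far as if is concerned.
branch_taken=no
if false &
then
    branch_taken=yes
fi
wait
echo "branch taken: $branch_taken"

# The launch reports status 0; the real result is only available from wait.
false &
launch_status=$?   # status of starting the background job
wait $!
real_status=$?     # status that false actually returned
echo "launch status: $launch_status, real status: $real_status"
```

The launch status is 0 even though the command itself fails with status 1, which is why the body of the if runs.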

chepner
  • I wondered if that would happen. I thought `&` proceeds with the next line of code, but thought the next line read was after the `fi`. I think DigitalTrauma's answer still allows the if to be evaluated. – Matt Oct 18 '13 at 20:09
  • Yes, his answer is correct. This is just an explanation of why you can't background the condition only. – chepner Oct 18 '13 at 20:16
0

The answers so far rely on spawning 3 jobs and waiting for all 3 to complete before starting again.

A "better" way in bash is to count the running jobs with the jobs command:

jobs | wc -l

and then loop with a small sleep command, like so (untested code):

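COUNTER=0   # added assumption: the original never set or advanced COUNTER
while [ "$COUNTER" -lt 10 ]; do
    # start a new job only while fewer than 3 are still running
    if [[ $(jobs -r | wc -l) -lt 3 ]]; then
        expression &
        COUNTER=$((COUNTER + 1))
    fi
    sleep 10
done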

That keeps all 3 "threads" busy: a new job starts as soon as one finishes, without waiting for all 3 to complete before refilling.
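For something that can be run as-is, here is a rough sketch of the same polling idea with concrete stand-ins. The task function, the limit of 3, the count of 7 tasks, and the poll interval are all assumptions chosen for illustration.

```shell
#!/bin/bash
# Sketch: keep at most max_jobs background jobs running by polling `jobs -r`.
max_jobs=3       # assumed concurrency limit (the "3 threads" above)
total_tasks=7    # hypothetical number of tasks to run
launched=0
peak=0           # highest number of jobs seen running at once

task() {         # hypothetical stand-in for real work
    sleep 0.3
}

while [ "$launched" -lt "$total_tasks" ]; do
    running=$(jobs -r | wc -l)       # count jobs that are still running
    if [ "$running" -lt "$max_jobs" ]; then
        task &
        launched=$((launched + 1))
        running=$((running + 1))
        if [ "$running" -gt "$peak" ]; then peak=$running; fi
    else
        sleep 0.05                   # all slots busy: poll again shortly
    fi
done
wait   # let the last batch finish
echo "launched $launched tasks, at most $peak at once"
```

Polling is simple but wastes a little time between checks; a shorter interval refills slots faster at the cost of more wake-ups.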

exussum