
If I run

#!/bin/bash
for i in `seq 5`; do
    exec 3> >(sed -e "s/^/$i: /"; echo "$i-")
    echo foo >&3
    echo bar >&3
    exec 3>&-
done

then the result is not synchronous; it could be something like:

1: foo
1: bar
2: foo
2: bar
1-
3: foo
3: bar
2-
3-
4: foo
5: foo
4: bar
5: bar
4-
5-

How do I ensure that the process substitution >(...) is completed before proceeding to the next iteration?

Inserting sleep 0.1 after exec 3>&- helped, but it's inelegant, inefficient, and not guaranteed to always work.
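(Note for readers on newer shells: since bash 4.4, `$!` is set after a process substitution and `wait` accepts that PID, so the sleep can be replaced with an explicit wait. A sketch, assuming bash >= 4.4:)

```shell
#!/bin/bash
# Requires bash >= 4.4, where $! holds the PID of the last process
# substitution and the wait builtin accepts it.
for i in 1 2 3 4 5; do
    exec 3> >(sed -e "s/^/$i: /"; echo "$i-")
    subst_pid=$!
    echo foo >&3
    echo bar >&3
    exec 3>&-          # close the writer so sed sees EOF
    wait "$subst_pid"  # block until the substitution has exited
done
```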

EDIT: The example may look silly, but it was for illustration only. What I'm actually doing is reading a stream of input in a loop, feeding each line to a process which occasionally changes during the loop. It's easier to explain in code:

# again, simplified for illustration
while IFS= read line; do
    case $line in
    @*)
        exec 3>&-
        filename=${line:1}
        echo "starting $filename"
        exec 3> >(sort >"$filename"; echo "finished $filename")
        ;;
    *)
        echo "$line" >&3
        ;;
    esac
done
exec 3>&-
musiphil
  • Not related to the question, but seq is bad form (not available on all platforms bash is supported on, and relatively inefficient); consider a C-style for loop instead: `for ((i=0; i<5; i++))` – Charles Duffy Jun 21 '12 at 01:29
  • As for the immediate question, the best answer is "Don't Do That". If you can explan what end you're trying to accomplish, we can provide a better way to meet that goal. – Charles Duffy Jun 21 '12 at 01:37
  • There's [a related FAQ item](http://mywiki.wooledge.org/BashFAQ/106). – musiphil Jun 27 '12 at 08:19

6 Answers


The following works in bash 4, using coprocesses:

#!/bin/bash
fd_re='^[0-9]+$'
cleanup_and_wait() {
    if [[ ${COPROC[1]} =~ $fd_re ]] ; then
        eval "exec ${COPROC[1]}<&-"
        echo "waiting for $filename to finish" >&2
        wait $COPROC_PID
    fi
}

while IFS= read -r line; do
    case $line in
    @*)
        cleanup_and_wait
        filename=${line:1}
        echo "starting $filename" >&2
        coproc { sort >"$filename"; echo "Finished with $filename" >&2; }
        ;;
    *)
        printf '%s\n' "$line" >&${COPROC[1]}
        ;;
    esac
done
cleanup_and_wait

For prior versions of bash, a named pipe can be used instead:

cleanup_and_wait() {
    if [[ $child_pid ]] ; then
      exec 4<&-
      echo "waiting for $filename to finish" >&2
      wait $child_pid
    fi
}

# this is a bit racy; without a force option to mkfifo,
# however, the race is unavoidable
fifo_name=$(mktemp -u -t fifo.XXXXXX)
if ! mkfifo "$fifo_name" ; then
  echo "Someone else may have created our temporary FIFO before we did!" >&2
  echo "This can indicate an attempt to exploit a race condition as a" >&2
  echo "security vulnerability and should always be tested for." >&2
  exit 1
fi

# ensure that we clean up even on unexpected exits
trap 'rm -f "$fifo_name"' EXIT

while IFS= read -r line; do
    case $line in
    @*)
        cleanup_and_wait
        filename=${line:1}
        echo "starting $filename" >&2
        { sort >"$filename"; echo "finished with $filename" >&2; } <"$fifo_name" &
        child_pid=$!
        exec 4>"$fifo_name"
        ;;
    *)
        printf '%s\n' "$line" >&4
        ;;
    esac
done
cleanup_and_wait

A few notes:

  • It's safer to use printf '%s\n' "$line" than echo "$line"; if a line contains only -e, for instance, some versions of echo will treat it as an option and print nothing.
  • Using an EXIT trap for cleanup ensures that an unexpected SIGTERM or other error won't leave the stale fifo sitting around.
  • If your platform provides a way to create a FIFO with an unknown name in a single, atomic operation, use it; this would avoid the condition that requires us to always test whether the mkfifo is successful.
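As a sketch of that last point on platforms without such an atomic call: creating the FIFO inside a fresh private directory from mktemp -d sidesteps the race, since the directory itself is created atomically with mode 0700.

```shell
#!/bin/bash
# mktemp -d atomically creates a directory only we can enter (mode 0700),
# so a FIFO created inside it cannot have been pre-created by an attacker.
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT
fifo_name=$tmpdir/fifo
mkfifo "$fifo_name"
```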
Charles Duffy
  • Don't know why, but if you write directly with > fifo, without going through a file descriptor (3, for example), the subprocess stops after the first printf. And the aim in my answer was to change the original code as little as possible. – Nahuel Fouilleul Jun 21 '12 at 15:03
  • @NahuelFouilleul Good catch, thank you -- the way I had it written was trying to open the input side of the FIFO multiple times. Tested the coprocess version thoroughly; I should have done the same with the FIFO version. – Charles Duffy Jun 21 '12 at 15:43
  • also you need to close the file descriptor 4>&- – Nahuel Fouilleul Jun 21 '12 at 16:00
  • @NahuelFouilleul I already do that as part of the `cleanup_and_wait` function. – Charles Duffy Jun 21 '12 at 19:17
  • @CharlesDuffy: Thanks for the answer using `coproc`; this is the first use of coprocesses I found out. :-) – musiphil Jun 21 '12 at 19:37
  • One caveat: if you do `some_cmd | while IFS= read line; do ... done`, the `while` loop runs in a subshell and the coprocess it started will not be recognized after the loop. You should write `while IFS= read line; do ... done < <(some_cmd)`; a common trick, and another use of process substitution! – musiphil Jun 21 '12 at 19:40

Easy, just pipe everything into cat.

#!/bin/bash
for i in `seq 5`; do
  {
  exec 3> >(sed -e "s/^/$i: /"; echo "$i-")
  echo foo >&3
  echo bar >&3
  exec 3>&-
  }|cat
done

Here's the output:

1: foo
1: bar
1-
2: foo
2: bar
2-
3: foo
3: bar
3-
4: foo
4: bar
4-
5: foo
5: bar
5-
ruief
  • This is a neat trick! However, in my second example (under "EDIT"), opening a file descriptor, writing to it, and closing it are not sequentially placed in the code to allow wrapping in a list, which makes it hard to apply your method. – musiphil Jun 12 '15 at 19:15
  • I tried your "EDIT" code, with the input of 100 filename lines (@00-@99) each one followed by 100 lines of the corresponding number followed by a counter. The output was clean (file 23 containing 100 lines "23-00" through "23-99" in order, etc.) under bash 3.2. – ruief Jun 12 '15 at 22:11
  • What about stdout: was "finished n" always followed by "starting n+1"? Were you able to apply your method to the code? – musiphil Jun 13 '15 at 00:04
  • I wasn't able to apply that method to ordering the "finished" output. You may find this answer helpful: https://unix.stackexchange.com/a/388530/32078 – ruief Jul 11 '18 at 20:34
mkfifo tmpfifo
for i in `seq 5`; do
  { sed -e "s/^/$i: /"; echo "$i-";} <tmpfifo &
  PID=$!
  exec 3> tmpfifo
  echo foo >&3
  echo bar >&3
  exec 3>&-
  wait $PID
done
rm tmpfifo
Nahuel Fouilleul
  • This is a good approach; I've adopted and improved it with some robustness fixes as a second ("if you _aren't_ using bash 4...") branch to my answer. – Charles Duffy Jun 21 '12 at 14:53

The "obvious" answer is to get rid of the process substitution.

for i in `seq 5`; do
    echo foo | sed -e "s/^/$i: /"; echo "$i-"
    echo bar | sed -e "s/^/$i: /"; echo "$i-"
done

So the question becomes, do you really need to structure your code using process substitution? The above is much simpler than trying to synchronize an asynchronous construct.
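A variant that stays closer to the original structure (a single sed per iteration, fed in one pipeline) while remaining fully synchronous might look like:

```shell
#!/bin/bash
for ((i=1; i<=5; i++)); do
    # One sed per iteration, as in the original; the pipeline
    # completes before the loop continues.
    { echo foo; echo bar; } | { sed -e "s/^/$i: /"; echo "$i-"; }
done
```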

chepner
  • Well -- using _two_ `sed; echo` constructs instead of something like `{ echo foo; echo bar; } | {sed; echo; }` or `{sed; echo;} < <(echo foo; echo bar)` (if one wants to be able to set variables during processing) isn't exactly being faithful to some of the efficiencies offered by the original form... but yes, the question at hand is indeed WHY?! – Charles Duffy Jun 21 '12 at 01:35
  • Good point, it is just a single `sed` process in the original that receives its input in two stages. – chepner Jun 21 '12 at 01:40
  • Yes, `sed` in my example was just for illustration, and I cannot replace the process substitution simply as stated above. I need outputs across iterations in a loop to be fed into a single process substitution. See my EDIT in the article above. – musiphil Jun 21 '12 at 02:15
  • As an aside -- `seq` is a nonstandard command (not found in POSIX). If writing for bash, it's safer to use a C-style for loop instead: `for ((i=0; i<5; i++))` – Charles Duffy Mar 27 '13 at 13:19

Another user asks the same question, and receives an exhaustive answer here.

ruief
  • Please don't post answers pointing to duplicates. Instead, leave a comment to point out the duplicate, ideally following the standard template (though it has changed in recent years; it used to be "Possible duplicate of XXX"; but now it's been changed to "Does this answer your question? XXX") – tripleee Mar 01 '22 at 05:09

It appears that, at minimum, you are required to close the descriptor; I'm not sure whether it is also required to wait for the PID. In the demo that follows, sort never fails when you don't wait $pidout, but it does fail when you don't exec >&-.

#!/bin/bash
seq 100000 |
(
while read;do
  # randomly fork an exec'd file descriptor
  if (( ! (pidout && RANDOM % 100) ));then
    # cleanup previous fork
    ((pidout)) && exec 1>&- && wait $pidout
    # fork &1 to >(subprocess)
    exec 1> >(
       # append my PID to each line
       # output to saved file descriptor
       sh -c 'exec sed "s/$/ $$/"' >&$fdout
    )
    pidout=$!
  fi
  # formatted line to exec'd file descriptor
  printf $'%6s\n' "$REPLY"
done
((pidout)) && exec 1>&- && wait $pidout
) {fdout}>&1 | # save $fdout file descriptor
# check output expected order
sort -k1,1n -c

However, a POSIX way that performs about twice as fast, ensures the correct order, and fits easily into scripts is provided by a few readable lines of awk. I have used it in exactly the situations where a process substitution would otherwise be paired with a while loop.

#!/bin/bash
seq 100000 |
awk 'BEGIN{srand()}
! (command && (rand() < 0.99)){
  if (command) close(command)
  command="sed \"s/$/ $$/\""   
}
{
  printf("%6s"ORS, $0)  | command
}' |
sort -k1,1n -c