
Running the following command does what I want when reading from a file:

parallel --gnu -j2 "echo {} && sleep 5" < myfile.txt

I would like to do something similar with a pipe. Note that I used the following page for inspiration for the pipe reader and writer: http://www.linuxjournal.com/content/using-named-pipes-fifos-bash

Here is my pipe reader file:

#!/bin/bash

pipe=/tmp/testpipe

trap "rm -f $pipe" EXIT

if [[ ! -p $pipe ]]; then
    mkfifo $pipe
fi

while true
do
    parallel --gnu -j2 "echo {} && sleep 5" <$pipe
done

echo "Reader exiting"

And here is my writer:

#!/bin/bash

pipe=/tmp/testpipe

if [[ ! -p $pipe ]]; then
    echo "Reader not running"
    exit 1
fi


if [[ "$1" ]]; then
    echo "$1" >$pipe
else
    echo "Hello from $$" >$pipe
fi

I then run pipeWriter several times, such as

$ ./pipeWriter one
$ ./pipeWriter two
$ ./pipeWriter three
$ ./pipeWriter four
$ ./pipeWriter five
$ ./pipeWriter six
$ ./pipeWriter seven
$ ./pipeWriter eight
$ ./pipeWriter nine
$ ./pipeWriter ten
$ ./pipeWriter ww
$ ./pipeWriter wwdfsdf
$ ./pipeWriter wwdfsdfsddfsd
$ ./pipeWriter testAgain

The shell running pipeReader shows:

$ ./pipeReader 
one
four
eight
ww
testAgain

There are two problems. First, data is missing: only five of the fourteen lines I wrote show up. Second, parallel does not seem to run jobs in parallel when it reads from the pipe. I would like it to run at most two jobs at a time: if only one job is queued it should run that one, and if a second arrives while the first is still running it should start that one too.

Where am I going wrong?

Xu Wang

1 Answer


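I cannot explain the missing data, and when I run it I do not get missing data. I can, however, explain why you do not see the jobs run in parallel: each writer closes the pipe after writing one line, so each pass of the while-loop in your reader may start parallel with only one argument, and with a single argument there is nothing to run alongside it.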
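The effect can be seen without parallel at all. The following is a minimal sketch, assuming bash and a throwaway FIFO created under `mktemp -d` (the function name `lines_until_eof` and the path `demo_pipe` are made up for this illustration). It counts how many lines a reader receives before end-of-file when a single writer behaves like pipeWriter.

```shell
#!/bin/bash
# Sketch: count how many lines a reader gets from a FIFO before it sees EOF.
# The FIFO lives in a temporary directory; it is not /tmp/testpipe.

lines_until_eof() {
    local dir pipe count
    dir=$(mktemp -d)
    pipe="$dir/demo_pipe"      # hypothetical name, for illustration only
    mkfifo "$pipe"

    # Writer: open the FIFO, write one line, close it (what pipeWriter does).
    echo "one" > "$pipe" &

    # Reader: wc reads until EOF, which arrives when the only writer closes.
    count=$(wc -l < "$pipe")
    wait
    rm -rf "$dir"

    echo $((count))            # arithmetic expansion strips any padding from wc
}

lines_until_eof
```

A reader that consumes its input until end-of-file, as parallel does here, is in the same position as `wc` in this sketch: it is handed one line and then told the input is finished.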

Instead of a pipe, use a file, as described in http://www.gnu.org/software/parallel/man.html#example__gnu_parallel_as_queue_system_batch_manager
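A rough sketch of that file-based approach, adapted from the linked man page section, might look like the following. The names `JOBQUEUE`, `submit_job`, and `run_queue` are hypothetical and exist only for this example; the `tail -n+0 -f` pipeline is the part taken from the man page, combined with the parallel command from the question.

```shell
#!/bin/bash
# Sketch of a file-backed job queue, adapted from the GNU Parallel man page.
# JOBQUEUE, submit_job and run_queue are hypothetical names for this example.

# Append one job argument to the queue file.
submit_job() {
    echo "$1" >> "${JOBQUEUE:-jobqueue}"
}

# Follow the queue file from its first line and feed each line to parallel.
# This does not return on its own; stop it with Ctrl-C.
run_queue() {
    local queue="${JOBQUEUE:-jobqueue}"
    touch "$queue"    # make sure the file exists without truncating it
    tail -n+0 -f "$queue" | parallel --gnu -j2 "echo {} && sleep 5"
}
```

With these definitions sourced in two shells, `run_queue` would run in the first and `submit_job one`, `submit_job two`, and so on in the second. Because the queue is a regular file, a writer finishing does not signal end-of-file to parallel.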

Maybe you can also use cat:

while true
do
    cat $pipe
done | parallel --gnu -j2 "echo {} && sleep 5"

If you have not already done so, walk through the tutorial; your command line will love you for it: http://www.gnu.org/software/parallel/parallel_tutorial.html

Ole Tange
  • Thanks for the good solution. Your `cat` command does work. I wonder whether it is inefficient, because I guess the loop spins very fast rather than being triggered only when $pipe changes. Actually, I don't think I understand it. I thought it would run `cat` again on each iteration, so parallel would be passed duplicate arguments. Is that true? Does parallel ignore the duplicate arguments? Thanks for the links, I am working through them. The `tail -f` solution seems nice. – Xu Wang Oct 27 '13 at 20:49
  • Efficiencywise you should be fine. The `cat` will not duplicate output and it will block for input (i.e. no busy wait). GNU Parallel does not ignore duplicated input. – Ole Tange Oct 28 '13 at 11:43
  • Using your cat example, I only get output on the 3rd attempt. For example: `echo "test" > $pipe` (nothing happens), `echo "test" > $pipe` (nothing happens), `echo "test" > $pipe` (parallel echoes "test"). – Jake Aug 25 '15 at 16:06
  • http://www.gnu.org/software/parallel/man.html#EXAMPLE:-GNU-Parallel-as-queue-system-batch-manager says: "There is a small issue when using GNU parallel as queue system/batch manager: You have to submit JobSlot number of jobs before they will start, and after that you can submit one at a time, and job will start immediately if free slots are available. Output from the running or completed jobs are held back and will only be printed when JobSlots more jobs has been started (unless you use --ungroup or -u, in which case the output from the jobs are printed immediately)." – Ole Tange Jun 02 '16 at 18:56
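
The point made in the comments, that `cat` on a named pipe blocks until there is input and does not hand out the same data twice, can be checked with a small experiment. This is only a sketch, assuming bash and a temporary FIFO; the function name `read_two_rounds` and the path `demo_pipe` are invented for the illustration. Each writer is started only after the previous `cat` has finished, so the two rounds cannot overlap.

```shell
#!/bin/bash
# Sketch: every read of a FIFO returns only data that has not been read before.

read_two_rounds() {
    local dir pipe
    dir=$(mktemp -d)
    pipe="$dir/demo_pipe"                 # hypothetical name, for illustration only
    mkfifo "$pipe"

    echo "one" > "$pipe" &                # first writer, in the background
    echo "first read: $(cat "$pipe")"     # cat blocks until that writer opens the FIFO
    wait

    echo "two" > "$pipe" &                # second writer, started after the first read ended
    echo "second read: $(cat "$pipe")"    # yields only the new line
    wait

    rm -rf "$dir"
}

read_two_rounds
```

The second `cat` prints only `two`: data read from a pipe is consumed, which is why the loop around `cat` neither repeats arguments nor spins while idle.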