I followed the sample code to create a GNU Parallel job queue, as shown below:
# create a job queue file
touch jobqueue
# start the job queue
tail -f jobqueue | parallel -u php worker.php
# in another shell, add the data
while read -r LINE; do echo "$LINE" >> jobqueue; done < input_data_file.txt
This approach works and handles the job as a simple job queue, but there are two problems:
1- Reading data from the input file and then writing it to the jobqueue (another file) is slow, as it involves disk I/O.
2- If for some reason my job aborts in the middle and I restart the parallel processing, it will re-run all the jobs already in the jobqueue file.
I can add code in worker.php to remove a line from jobqueue once its job is done, but I feel there is a better way to do this.
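For reference on problem 2, GNU Parallel does have --joblog and --resume options that record completed jobs and skip them on a restart; a rough, untested sketch (joblog.txt is just a name I picked) would be:
# log every completed job to joblog.txt; with --resume, jobs already
# recorded there are not re-run after a restart
tail -n+0 -f jobqueue | parallel -u --joblog joblog.txt --resume php worker.php
As far as I understand, --resume matches jobs by sequence number, so the restarted run has to replay the queue file from the beginning (hence tail -n+0 -f) in the same order.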
Is it possible that instead of using
tail -f jobqueue
I can use a named pipe as input to parallel, so that my current setup still works as a simple queue?
I guess that way I won't have to remove the finished lines from the pipe, since they are consumed automatically on read?
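Something along these lines is what I have in mind (a rough, untested sketch; jobqueue_fifo is just a name I made up, and the exec 3> step is only there to keep a write end open so parallel does not see EOF as soon as a single producer finishes):
# create a named pipe instead of a regular file
mkfifo jobqueue_fifo
# start the job queue; parallel reads jobs straight from the FIFO
parallel -u php worker.php < jobqueue_fifo
# in another shell, hold a write end open, then add the data;
# each line is consumed on read and never has to be cleaned up
exec 3> jobqueue_fifo
cat input_data_file.txt > jobqueue_fifo
# when the queue should shut down, close the held write end:
# exec 3>&-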
P.S. I know of and have used RabbitMQ, ZeroMQ (and I love it), nng, nanomsg, and even PHP's pcntl_fork as well as pthreads, so this is not a question about what is available for parallel processing. It is a question about building a working queue with GNU Parallel.