I am trying to process some text files with awk, using the parallel command in a shell script, but I haven't been able to get it to write each job's output to a different file.

If I try:

seq 10 | parallel awk \''{ if ( $5 > 0.4 ) print $2}'\' file{}.txt > file{}.out

It outputs to the file file{}.out instead of file1.out, file2.out, etc.

The tutorial and man pages also suggest that I could use --files, but it just prints to stdout:

seq 10 | parallel awk \''{ if ( $5 > 0.4 ) print $2}'\' file{}.txt --files file{}.out
Scott Ritchie

3 Answers

It turns out I needed to quote the redirect. Unquoted, it was being processed by the calling shell, outside of parallel, before {} was ever substituted:

seq 10 | parallel awk \''{...}'\' file{}.txt ">" file{}.out
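To see the mechanism without parallel installed, here is a minimal sketch. It assumes a hypothetical helper, run_each, that only mimics how parallel substitutes {} and hands the resulting line to a shell; it is not part of GNU parallel, and echo stands in for the awk command.

```shell
#!/bin/sh
# run_each is a hypothetical stand-in for parallel (an assumption for this
# sketch): it replaces {} with each input line and runs the result via sh -c.
run_each() {
    while IFS= read -r arg; do
        sh -c "$(printf '%s\n' "$*" | sed "s/{}/$arg/g")"
    done
}

# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Unquoted: the calling shell opens the literal file "file{}.out" before
# run_each starts, so every job writes into that single file.
printf '%s\n' 1 2 | run_each echo job{} > file{}.out

# Quoted: ">" is passed along as an ordinary argument, so each job's shell
# performs the redirect itself, after {} has been substituted.
printf '%s\n' 1 2 | run_each echo job{} ">" file{}.out

ls
```

The first call leaves one file literally named file{}.out holding the output of both jobs; the second creates file1.out and file2.out.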
Scott Ritchie

Another way is to put the entire command that parallel should run inside double quotes, so the redirect is part of the string parallel receives:

seq 10 | parallel " awk command > file{}.out "

Sometimes it is also useful to send the output both to a file and to stdout. You can achieve that with tee. In this case, the command could be:

seq 10 | parallel " awk command | tee file{}.out "
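As a rough illustration of what the quoted pipeline does for each job, here is a sketch that runs without parallel. The for loop with sh -c is only a stand-in for parallel (an assumption made for the example), and printf stands in for the awk command.

```shell
#!/bin/sh
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The loop and sh -c stand in for parallel; printf stands in for awk.
for n in 1 2 3; do
    # Everything inside the double quotes, including the pipe to tee,
    # is interpreted by the inner shell, once per job.
    sh -c "printf 'result %s\n' $n | tee file$n.out"
done > combined.log

# Each job's output now exists twice: in its own file and in the
# shared stream captured in combined.log.
wc -l file1.out file2.out file3.out combined.log
```

Because tee copies its input to stdout as well as to the file, the per-job files and the combined stream carry the same lines.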

Bruce_Warrior

--results is made for this:

seq 10 |
  parallel --results file{}.out awk \''{ if ( $5 > 0.4 ) print $2}'\' file{}.txt

It will also generate file*.out.seq (containing the sequence number) and file*.out.err (containing stderr).

Ole Tange