
I'm looping over a large file. For each line I run some commands, and when they finish I want their entire output appended to a file.

Since nothing stops me from running several of these at once, I tried running each iteration in the background with &.

It doesn't work as expected: the output is appended to the file as the commands finish, not in the order they appear in the subshell.

#!/bin/bash
while read -r line; do
  (
    echo -e "$line\n-----------------"
    trivy image --severity CRITICAL "$line"
    # or any other command that might take 1-2 seconds
    echo "============="
  ) >> vulnerabilities.txt &
done <images.txt

Where am I wrong?

Moshe
    There's no guarantee that jobs complete in the order they were started, so there's no guarantee on the order of data written to the single output file. If you want the output ordered, you'll need to code for that: e.g., have each background job write to a separate log file, have the main script `wait` until all background jobs complete, and then consolidate the output files into one in the desired order. Another option, if the output can be prefixed with a background job number, is to `sort` the single output file afterwards. – markp-fuso Dec 16 '21 at 16:57
    All the background processes are running concurrently, and they're all writing to the file as they go. So all the output gets mixed together. – Barmar Dec 16 '21 at 17:30
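
The separate-file approach described in the first comment could be sketched roughly as follows. This is only an illustration, not the commenter's own code: `scan` is a hypothetical stand-in for the real command (for example `trivy image --severity CRITICAL "$1"`), and `scan_in_order` and the temp-file naming are assumptions made for the example.

```shell
#!/bin/bash

# Hypothetical stand-in for the real scanner; replace the body with
# e.g. trivy image --severity CRITICAL "$1"
scan() {
  echo "report for $1"
}

# Run one background job per input line, each writing to its own numbered
# temp file, then append the files to the output in input order.
scan_in_order() {
  local input=$1 output=$2 tmpdir n=0 line
  tmpdir=$(mktemp -d) || return 1

  while read -r line || [ -n "$line" ]; do
    n=$((n + 1))
    (
      printf '%s\n-----------------\n' "$line"
      scan "$line"
      echo "============="
    ) > "$tmpdir/$(printf '%06d' "$n").out" &
  done < "$input"

  wait                              # block until every background job has exited
  cat "$tmpdir"/*.out >> "$output"  # zero-padded names sort in input order
  rm -rf "$tmpdir"
}
```

It would be called as `scan_in_order images.txt vulnerabilities.txt`. Note that this sketch starts one job per input line all at once, so for a large file some limit on concurrency would probably be needed.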

1 Answer


Consider using GNU Parallel to get lots of things done in parallel. In your case:

parallel -k -a images.txt trivy image --severity CRITICAL > vulnerabilities.txt

The -k keeps the output in the same order as the input. Add --bar or --eta for progress reports, or --dry-run to see what it would do without actually running anything. Add -j ... to control the number of jobs running at any one time; by default it runs one job per CPU core, so it keeps all your cores busy until the jobs are done.

If you want to do more processing on each line, you can declare a bash function, export it with export -f, and call that with each line as its parameter.
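
As a rough sketch of the function-based variant, assuming bash is the shell GNU Parallel starts for each job: the names `scan_one` and `scan_all` are made up for this example, and the header and footer lines simply mirror the ones in the question.

```shell
#!/bin/bash

# Per-image report: header, scan output, footer (same layout as the question).
scan_one() {
  printf '%s\n-----------------\n' "$1"
  trivy image --severity CRITICAL "$1"
  echo "============="
}
export -f scan_one   # make the function visible to the shells parallel starts

# $1 = file with one image per line, $2 = output file.
# -k keeps the output in input order even though jobs finish out of order.
scan_all() {
  parallel -k -a "$1" scan_one > "$2"
}
```

It would be called as `scan_all images.txt vulnerabilities.txt`. Because each job's output is buffered and emitted as a unit, the header, report, and footer for one image should stay together.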

Mark Setchell