I have a non-trivial Bash script taking roughly the following form:
    # Initialization
    <generate_data> | while read -r line; do
        # Run tests and filters on line
        if [ "$tests_pass" ]; then
            echo "$filtered_line"
        fi
    done | sort <sort_option> | <consume_data>
    # Finalization
Compared to the filter, the generator consumes negligible CPU, and the sort cannot begin until all of the filtered data is available. The filter itself, a cascade of loops and conditionals written in pure Bash, is therefore the bottleneck: the single process running this loop saturates an entire core.
The goal is to distribute this logic across several child processes, each running its own copy of the filter loop, each consuming a block of lines from the generator, and each feeding its filtered output into the common sort. Tools such as GNU Parallel provide this kind of functionality, but using them means inserting an external command into the pipe, for example as sketched below.
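With GNU Parallel, for instance, the pipeline might look roughly like the following sketch, assuming the filter loop were extracted into a standalone script (hypothetically named filter.sh here); parallel --pipe splits standard input into blocks and runs one filter instance per block:

    #!/usr/bin/env bash
    # filter.sh (hypothetical name): the filter loop extracted into its own
    # executable so that several instances can run concurrently.
    while read -r line; do
        # Run tests and filters on line
        if [ "$tests_pass" ]; then
            echo "$filtered_line"
        fi
    done

The main script then delegates the filtering stage to parallel:

    # parallel --pipe splits stdin into blocks (here 1 MB) and feeds each
    # block to a separate filter.sh process on its standard input.
    <generate_data> \
        | parallel --pipe --block 1M ./filter.sh \
        | sort <sort_option> \
        | <consume_data>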
Is there a convenient tool or shell feature that would let these operations be distributed across multiple processes without disrupting the overall structure of the script? I am not aware of a Bash builtin for this, but one would certainly be useful.
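For reference, a hand-rolled fan-out of the kind I have in mind is sketched below (assuming the intermediate data may be buffered in temporary files, and that GNU split is available for split -n l/N); this is precisely the sort of restructuring I would prefer to avoid:

    #!/usr/bin/env bash
    # Hand-rolled fan-out: buffer the generated lines, split them into one
    # chunk per worker, filter each chunk in a background subshell, then
    # concatenate the results into the sort.
    nworkers=4
    tmpdir=$(mktemp -d)
    trap 'rm -rf "$tmpdir"' EXIT

    # Initialization
    <generate_data> > "$tmpdir/input"
    split -n l/"$nworkers" "$tmpdir/input" "$tmpdir/chunk."

    for chunk in "$tmpdir"/chunk.*; do
        (
            while read -r line; do
                # Run tests and filters on line
                if [ "$tests_pass" ]; then
                    echo "$filtered_line"
                fi
            done < "$chunk" > "$chunk.out"
        ) &
    done
    wait

    cat "$tmpdir"/chunk.*.out | sort <sort_option> | <consume_data>
    # Finalization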