I have a job that reads data from a `\n`-delimited stream and pipes it to xargs, which processes it one line at a time. The problem is that this isn't fast enough. I know that if the command executed by xargs received multiple lines per invocation instead of just one, it would drastically improve the performance of my script.
Is there a way to do this? I haven't had any luck with various combinations of `-L` or `-n`. Unfortunately, I think I'm also stuck with `-I` to parameterize the input, since my command doesn't seem to want to read from stdin unless I use `-I`.
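For example, this toy reproduction (with `echo` standing in for my real command; GNU xargs, and exact warnings may vary by version) shows the behavior I'm seeing:

```bash
# With -I in play, I get one invocation per input line, despite -n 3:
seq 5 | xargs -d '\n' -n 3 -I {} echo "got: {}"
# got: 1
# got: 2
# got: 3
# got: 4
# got: 5
```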
The basic idea is that I'm trying to simulate mini-batch processing using xargs.
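As a contrived illustration of the grouping I'm after (again with `echo` as a stand-in), batching does happen when `-I` is out of the picture:

```bash
# Without -I, xargs packs up to 3 input lines into each invocation:
seq 10 | xargs -d '\n' -n 3 echo batch:
# batch: 1 2 3
# batch: 4 5 6
# batch: 7 8 9
# batch: 10
```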
Conceptually, here's something similar to what I currently have written:

```bash
contiguous-stream | xargs -d '\n' -n 10 -L 10 -I {} bash -c 'process_line {}'
```
In the above, `process_line` is easy to change so that it can process many lines at once, and this function is currently the bottleneck. For emphasis: in the command above, `-n 10` and `-L 10` don't seem to do anything; my lines still get processed one at a time.
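For illustration, a batch-capable `process_line` could look something like this (the body is a hypothetical stand-in; the real logic would do the expensive work once per batch):

```bash
# Hypothetical sketch: process_line reworked to accept many lines as arguments.
process_line() {
  # e.g. one expensive setup step here, amortized over the whole batch
  for line in "$@"; do
    printf 'processing: %s\n' "$line"
  done
}
```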