
I have a tool that outputs timestamped information. I am trying to create a script that invokes multiple instances of the tool and outputs sorted information. The idea is:

for id in "${ids[@]}"
do
  my_tool "$id" > "$id" &
  outputs[$id]= # first line of the "$id" file
done

while [[ $(still_running) ]]
do
  # find the index of the minimum timestamp in the outputs array
  echo "${outputs[$index]}"
  outputs[$index]= # read in the next line of file "$index"
done

However, I am not sure how to implement the file reading. I know how to read a file line by line, but that requires keeping the file open. I could open multiple files and keep them open, but I don't know how to do that without manually creating variables for each of them.

I think it is possible if I somehow keep track of which line each file is up to, but I don't know how to read a specific line from a file in bash.

For example, I want to be able to do something like

var1=$(read file1 line1)
var2=$(read file2 line1)
var3=$(read file3 line1)
var1=$(read file1 line2)
var1=$(read file1 line3)
var3=$(read file3 line2)
...
base12
  • If you want to read several files in parallel, there are not many alternatives to keep them open. Of course you could write a function, which opens a file, reads the nth line (for some n passed as parameter), and then closes the file, but unless the files involved are small, this is not very effective, as you have to scan every time the file from the beginning until you find the n-th line. – user1934428 Aug 03 '21 at 05:32
  • I'm having trouble understanding your question. It could do with an example IMHO. – Mark Setchell Aug 03 '21 at 07:33
  • @MarkSetchell edited to add an example, does that help? – base12 Aug 03 '21 at 23:37
  • Somewhat. How many files do you want to read at once? – Mark Setchell Aug 03 '21 at 23:39
  • It is variable, depending on the length of the `ids` array. Most likely between 1 and 10. – base12 Aug 03 '21 at 23:52

2 Answers


If we keep track of the line number of each file, we can read a particular line using `sed "${NUM}q;d" file`, as per Bash tool to get nth line from a file.

So for example

outputs[$index]=$(sed "${line_nos[$index]}q;d" "$index")
line_nos[$index]=$(( line_nos[$index] + 1 ))

I'm sure there's a more elegant solution though.
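For instance, here is a runnable toy version of the bookkeeping, with two dummy files `a` and `b` standing in for the tool's output (only the counter handling is shown, not the minimum-timestamp selection):

```shell
#!/bin/bash
# Toy demo of per-file line counters (needs bash 4+ for associative arrays).
printf '1 first\n3 third\n' > a
printf '2 second\n' > b

declare -A line_nos
for id in a b; do
    line_nos[$id]=1            # next line to read from each file
done

# Read the current line of file "a" and advance its counter:
line=$(sed "${line_nos[a]}q;d" a)
line_nos[a]=$(( line_nos[a] + 1 ))
echo "$line"                   # prints: 1 first
```

The downside, as noted in the comments, is that `sed` rescans the file from the top on every call, so this gets slower the further into each file you are.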

base12
  • Getting the nth line from a file would be `sed -n 'np'`, e.g. for 2nd line `sed -n '2p'` where `-n` suppresses the normal printing of pattern space, and `p` will print that line. To quit at that point, `sed -n '2p;q'`. – David C. Rankin Aug 03 '21 at 03:52

I still don't really understand what you are trying to do, but I wanted to show you how you can read from multiple files one line at a time.

Just for simplicity, I'll make a dummy process that produces 4 lines of output:

#!/bin/bash
for ((i=0;i<4;i++)) ; do
   echo "Process $1 Line $i"
done

Save that as `process` and make it executable with:

chmod +x process

and run it with:

./process A

and get:

Process A Line 0
Process A Line 1
Process A Line 2
Process A Line 3

Setup complete... now to the answer. Save the following as `go`:

#!/bin/bash
# Loop until either stream runs out of lines
while read -u 3 -r pA && read -u 4 -r pB ; do
   echo "$pA"
   echo "$pB"
done 3< <(./process A) 4< <(./process B)

and make it executable:

chmod +x go

Sample Run

./go | more
Process A Line 0
Process B Line 0
Process A Line 1
Process B Line 1
Process A Line 2
Process B Line 2
Process A Line 3
Process B Line 3

Here's a similar thing without a while loop:

#!/bin/bash

{
  # Read and print 2 lines from process A
  read -u 3 -r pA
  echo "$pA"
  read -u 3 -r pA
  echo "$pA"
  # Read and print 4 lines from process B
  read -u 4 -r pB
  echo "$pB"
  read -u 4 -r pB
  echo "$pB"
  read -u 4 -r pB
  echo "$pB"
  read -u 4 -r pB
  echo "$pB"
  # Resume reading last 2 lines from process A
  read -u 3 -r pA
  echo "$pA"
  read -u 3 -r pA
  echo "$pA"
} 3< <(./process A) 4< <(./process B)

Notes:

  • You can get help on read by running help read
  • You can read about process substitution here
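If the number of streams is variable (like the `ids` array in the question), bash 4.1+ can pick the descriptor numbers for you with `exec {var}<`, so nothing needs hard-coding. A sketch, with a dummy inline producer standing in for `./process`:

```shell
#!/bin/bash
# Open one descriptor per id; bash stores each allocated fd (>= 10) in "fd".
ids=(A B C)
declare -A fds

for id in "${ids[@]}"; do
    exec {fd}< <(for ((i=0;i<2;i++)); do echo "Process $id Line $i"; done)
    fds[$id]=$fd
done

# Round-robin: one line from each stream per pass.
for ((i=0;i<2;i++)); do
    for id in "${ids[@]}"; do
        read -u "${fds[$id]}" -r line && echo "$line"
    done
done

# Close the descriptors when finished.
for id in "${ids[@]}"; do
    fd=${fds[$id]}
    exec {fd}<&-     # closes the fd whose number is stored in "fd"
done
```

Keeping the fds in an associative array keyed by id is what lets you avoid manually creating a variable per file.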
Mark Setchell