What you're trying to achieve is not possible with a guarantee. First thing to note is that your code could only possibly order based on the time that the data was received, not when it was produced, which is not quite the same. The only way to guarantee this would be to do something on the source which will add some guaranteed ordering between the two systems.
The below code should make it "more likely" to be correct by removing the threads. Assuming that you're using MRI, the threads are "green" so technically can't be running at the same time. That means you're beholden upon the scheduler choosing to run your thread at the "right" time.
Open3.popen3("my_custom_script with_some_args") do |_in, stdout, stderr|
for_reading = [stdout, stderr]
until(for_reading.empty?) do
wait_timeout = 1
# IO.select blocks until one of the streams is has something to read
# or the wait timeout is reached
readable, _writable, errors = IO.select(for_reading, [], [], wait_timeout)
# readable is nil in the case of a timeout - loop back again
if readable.nil?
Thread.pass
else
# In the case that both streams are readable (and thus have content)
# read from each of them. In this case, we cannot guarantee any order
# because we recieve the items at essentially the same time.
# We can still ensure that we don't mix data incorrectly.
readable.each do |stream|
buffer = ''
# loop through reading data until there is an EOF (value is nil)
# or there is no more data to read (value is empty)
while(true) do
tmp = stream.read_nonblock(4096, buffer, exception: false)
if tmp.nil?
# stream is EOF - nothing more to read on that one..
for_reading -= [stream]
break
elsif tmp.empty? || tmp == :wait_readable
# nothing more to read right now...
# continue on to process the buffer into lines and log them
break
end
end
if stream == stdout
buffer.split("\n").each { |line| logger.info(format(:info, line)) }
elsif stream == stderr
buffer.split("\n").each { |line| logger.info(format(:error, line)) }
end
end
end
end
end
Note that in a system generating a lot of output in a very short period of time there is more likely to be an overlap where things get out of order. This likelihood increases with the amount time taken to read the stream and process it. It would be best to ensure that the absolute minimum processing is done inside the loop. If the formatting (and writing) are expensive, consider moving those items into a separate thread reading from a single queue, and have the code inside the loop only push the buffer (and source identifier) onto the queue.