
I'm working on a devtool, and one of its features involves spawning a child process and reading that process's stdout stream. I need to read each line of output into memory in order to process it in some way (a future feature of the tool will involve processing logs and sending them to external locations such as log managers, dashboards, etc.), which is why I don't simply set cmd.Stdout = os.Stdout.

It works fine, and has done for a while, but apparently only on Windows. I recently got a rather confusing bug report where a user said the output isn't "real time", and upon testing on Linux I found it's true: the output is only dumped to the console when the process exits.

Here is the code that scans the reader; it works as expected on Windows but not on Linux, nor in a Linux container on Windows/macOS (tested both):

https://github.com/Southclaws/sampctl/blob/f639c941dd8f9ca7c7c819a152909044ad63be08/runtime/run.go#L133-L137

And if you poke around the code you'll find where the reader is created with io.Pipe() and bound to the cmd's Stdout/Stderr outputs.

Line 134 is where the program blocks until the cmd in the goroutine below it stops running, on line 161.

I'm assuming this is related to buffers and flushing, but I don't know enough about Go's internals to pinpoint the problem. What exactly is different about scanner.Scan() on Windows versus Linux? Why does it block on one platform but not the other? Is it related to threads/goroutines being scheduled differently? (Both test machines have multiple cores; even the Docker container had 4 vCPUs.)

Here's the issue for reference: https://github.com/Southclaws/sampctl/issues/100

I'm really stumped on this one, would love some help figuring it out!

Edit:

So I messed around some more, still no solution. I attempted the same thing with a Python script and got the same result: stdout works fine when directed to a tty, but when it's read by another process it just hangs:

from subprocess import Popen, PIPE

p = Popen(
    ['/root/.samp/runtime/0.3.7/samp03svr'],
    stdin=PIPE,
    stdout=PIPE,
    stderr=PIPE,
    shell=False,
    cwd='/root/.samp/runtime/0.3.7/')

while True:
    print('attempting to read a line')
    output = p.stdout.readline()
    print('read a line')
    if not output:
        print('[No more data]')
        break
    print(output)

`attempting to read a line` is where it hangs.

Southclaws
  • `bufio` does not have target-specific code. Have you tried teeing the output using `io.MultiWriter` to detect whether it is blocked by cmd? My wild guess is that it's your external program behaving differently on Linux, rather than your Go code. – leaf bebop Jan 29 '18 at 11:40
  • Just a wild guess - could it be related to the way newlines need to be treated differently? Could you write a small program which gives the scanner the exact same input currently being given to it, and run it on both Windows and Linux to see what happens? – Ravi R Jan 29 '18 at 13:20
  • @leafbebop I didn't think of that, I'll investigate. The child process appears the same on Linux in terms of output, though, i.e. the output is streamed to the console normally through stdout. – Southclaws Jan 29 '18 at 13:22
  • @Ravi It is very unlikely, according to the documentation of `bufio.ScanLines`. But yes, a more controlled test would help. There is still too much mist here. – leaf bebop Jan 29 '18 at 13:24
  • For testing purpose, if you replace current implementation with `cmd.Stdout = os.Stdout`, does it work correctly/as expected on Linux? – putu Jan 30 '18 at 03:14
  • I've just tested this and yes, it outputs as expected with `cmd.Stdout = os.Stdout`, it seems to be just binding it to io.Pipe() writer that screws it up. – Southclaws Jan 30 '18 at 17:30
  • I've now even tried reading byte-by-byte with `.Read(buf)` and it still blocks indefinitely... piping to stdout currently seems like the only fix but that circumvents the entire point of my project... – Southclaws Jan 30 '18 at 18:00
  • See https://stackoverflow.com/questions/7876660/how-to-turn-off-buffering-of-stdout-in-c – Charlie Tumahai Feb 19 '18 at 03:21
  • @CeriseLimón this didn't apply to this case because that's targeting the child application's output - which I don't have access to. The chosen answer was correct. – Southclaws Feb 19 '18 at 19:15
  • The issue is with buffering in the child application. The `stdbuf` command uses a preload hack to call `setvbuf()` in the child application. The answers to the question I linked show how to solve the problem when you can change the child application: call `setvbuf()`, call `fflush()`, etc. – Charlie Tumahai Feb 19 '18 at 19:56

2 Answers


By default, output on Linux is buffered when not in interactive mode (i.e. when stdout is not a terminal): the C standard I/O library switches from line buffering to full buffering, so the output is only flushed when the buffer is full (e.g. every 4096 bytes, though the size is implementation defined); when the program explicitly calls flush (which is apparently not happening here); or when the process ends (as you are seeing).

You can change this default behaviour by adjusting the buffer size, for example by launching the program via stdbuf:

stdbuf -o0 /root/.samp/runtime/0.3.7/samp03svr

-o for stdout (there are also -e for stderr and -i for stdin), 0 for unbuffered (also L for line buffered, or a size for an explicit buffer size).

Alternatively, there is an unbuffer command, or the script command:

https://unix.stackexchange.com/questions/25372/turn-off-buffering-in-pipe/61833#61833

Y_Less

Following on from Y_Less's answer, the general solution was to use a pseudo-terminal. I wanted to avoid using stdbuf or unbuffer, as that would mean depending on external commands being present.

So my end solution was https://github.com/kr/pty, a Go package for working with Unix pseudo-terminals.

Just thought I'd self-answer to help any other Go users finding this question via search.
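A minimal sketch of that approach (the server path is the one from the question; substitute your own command):

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"os/exec"

	"github.com/kr/pty"
)

func main() {
	cmd := exec.Command("/root/.samp/runtime/0.3.7/samp03svr")

	// pty.Start attaches the child's stdio to a new pseudo-terminal, so
	// the child's C library sees a tty and line-buffers its output as if
	// it were running interactively.
	f, err := pty.Start(cmd)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		fmt.Println(scanner.Text()) // output arrives as the child prints it
	}
	cmd.Wait()
}
```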

Southclaws