Piped output from Python gets backed up

Question

I have a Python program that prints JSON objects (built from intermittent messages from a game's API). When I run the program by itself, each message appears in the console output immediately.

$ python myprog.py
{"SG_MSG":{"obj_id":"SD748","aspect":"0","time":"42"}}
{"SG_MSG":{"obj_id":"SD748","aspect":"6","time":"75"}}

But when I pipe the output into jq, jq outputs nothing for ages, then sporadically spits out a bunch of messages at once:

$ python myprog.py | jq .

I've tried prefixing each message with the record separator character (ASCII 30) and using jq --seq . but the result is the same.

Piping to od shows the record separator character is appearing as the jq manual says it should:

$ python myprog.py | od -t d1
… … … 125 125 10 30 123 … …

which is }, }, LF, RS, {

And piping to od shows the same behaviour, where no output is shown until a bunch of messages have backed up.

I'm guessing there's something elementary about bash I'm overlooking here…?

Add `sys.stdout.flush()` calls to your Python script, or use the `-u` argument to the Python interpreter, or set the environment variable `PYTHONUNBUFFERED`. This isn't a bash problem, *or* a jq problem, but [BashFAQ #9](https://mywiki.wooledge.org/BashFAQ/009) provides useful background for understanding it. — Charles Duffy, Mar 22 '19 at 19:25
(Buffering stdout when not to a TTY is default behavior for *the standard C library*, so it impacts *all* programs written in C that don't go out of their way to override it, no matter whether they're invoked from bash or not). — Charles Duffy, Mar 22 '19 at 19:29
Thanks! But why doesn't it buffer when outputting to console, then? — Jack Deeth, Mar 22 '19 at 20:12
@JackDeeth, see, in my prior comment, the caveat "when not to a TTY" -- the console is a TTY (in the sense that the [`isatty()` function](http://pubs.opengroup.org/onlinepubs/009695399/functions/isatty.html) returns true). The assumption is that when something is writing to a TTY, it's being read by a user who wants to see things immediately (and couldn't keep up with the kind of high-speed stream of data that large buffers optimize for anyhow), and thus that latency is more important than throughput. This is why the `unbuffer` command, for example, operates by *emulating* a TTY. — Charles Duffy, Mar 22 '19 at 22:38
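
As a rough illustration of the suggestions in the comments above, here is a minimal sketch of flushing after each message. The `emit` helper and the sample message are hypothetical stand-ins, since the original `myprog.py` is not shown; the only assumption is that the program writes one JSON object per line to stdout.

```python
import json
import sys


def emit(message, stream=None):
    """Write one JSON object per line and flush right away.

    `emit` is a hypothetical helper, not part of the original program.
    """
    if stream is None:
        stream = sys.stdout
    stream.write(json.dumps(message) + "\n")
    # Without this flush, output to a pipe sits in the buffer until it
    # fills up; with it, the reader (jq, od, ...) sees each line at once.
    stream.flush()


if __name__ == "__main__":
    # True on a console, False when stdout is a pipe such as `| jq .`
    print("stdout is a TTY:", sys.stdout.isatty(), file=sys.stderr)
    emit({"SG_MSG": {"obj_id": "SD748", "aspect": "0", "time": "42"}})
```

Calling `print(json.dumps(message), flush=True)` should have the same effect as the explicit `flush()`. Running the unmodified script as `python -u myprog.py | jq .` or with `PYTHONUNBUFFERED=1` set are the no-code-change alternatives mentioned above.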
