Given the following small program:

#!/usr/bin/env python3
import time

for i in range(15):
    print(f'{i}: sleeping')
    time.sleep(1)

When I run it with stdout attached to the terminal directly, I get the output pretty much immediately:

./sync_test.py

However, if I run it with stdout attached to a pipe (with cat on the other end, reading from the pipe and printing to the terminal), I don't get any output until the program ends:

./sync_test.py | cat

The only way I can get the output to be printed before the program quits is if I add flush=True to all of my print statements.
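
For reference, a minimal sketch of that workaround. The helper name `announce` and its `count`, `delay`, and `out` parameters are made up for illustration; only `print(..., flush=True)` is the actual fix:

```python
import sys
import time


def announce(count=15, delay=1.0, out=None):
    """Print one line per iteration, flushing after each line."""
    out = sys.stdout if out is None else out
    for i in range(count):
        # flush=True pushes the line out immediately, even when `out` is a pipe
        print(f'{i}: sleeping', file=out, flush=True)
        time.sleep(delay)
```

Calling `announce()` from a script run as `./sync_test.py | cat` should then show one line per second instead of everything at the end.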

I tried this out in a few other languages (Go, Java, and C) and got mixed results. In Go and Java, the programs print immediately when stdout is not attached to a TTY. But with C, I experience the same behavior as I do with Python.

The Go program (test_go.go):

package main

import (
    "fmt"
    "time"
)

func main() {
    for i := 0; i < 15; i++ {
        fmt.Printf("%d: sleeping\n", i)
        time.Sleep(1 * time.Second)
    }
}

The Java program (test_java.java):

public class test_java {
    public static void main(String[] args) throws Exception{
        for(int i=0; i<15; i++){
            System.out.println(i + ": sleeping");
            Thread.sleep(1000);
        }
    }
}

The C program (test_c.c):

#include <stdio.h>
#include <unistd.h>
int main() {
    for(int i=0; i < 15; i ++) {
        printf("Hello, World!\n");
        sleep(1);
    }
    return 0;
}

(All programs were run with | cat appended to the command.)

I understand that stdout is buffered. What explains the differences in behavior across languages, and why does it matter whether stdout is connected to a TTY? Are Java and Go flushing the buffers implicitly whereas C and Python are not?

wheeler
  • Python itself doesn't care. Buffering is controlled by the output destination. Direct terminal output is line-buffered. Pipes have much larger buffers, a minimum of 8 KB and often more. – John Gordon Nov 02 '22 at 23:55
  • Adding to @John Gordon's point, this isn't related to `asyncio` - you'll get the same behavior with a regular program that does, eg, `print("hello"); time.sleep(10)`. – craigb Nov 03 '22 at 00:07
  • I just realized that. I guess my question is then: why does **Python** do this? Are other languages flushing the buffer implicitly when I print? – wheeler Nov 03 '22 at 00:09
  • Did you try any other languages? Most of them should behave the same. – user2357112 Nov 03 '22 at 00:16
  • I updated the question. Golang, Java, and C all print pretty much immediately, regardless of whether or not `stdout` is attached to a terminal. – wheeler Nov 03 '22 at 00:20
  • See [answer](https://stackoverflow.com/a/17615727/) if you simply want to force Python to not buffer. Also, a trivial C program with a series of `printf` calls on strings without `\n`, with `sleep` in between them, will demonstrate the need to call `flush`, so your assertion that C doesn't print right away is proven false. – metatoaster Nov 03 '22 at 00:20
  • I never said that C doesn't print right away. I said that C *does* print right away, regardless of whether or not `stdout` is attached to a TTY. All output in all programs are printing newlines, so your point about lack of `\n` is irrelevant. – wheeler Nov 03 '22 at 00:25
  • So does Python. See [this thread](https://stackoverflow.com/questions/6290117/printf-flush-problem) to clear up your misconceptions. – metatoaster Nov 03 '22 at 00:26
  • As I said, Python does *not* print right away if `stdout` is not attached to a TTY. Please re-read my question to clear up your misconceptions. – wheeler Nov 03 '22 at 00:27
  • The other answer I linked earlier showed that you can change that behavior, Try `python -u sync_test.py | cat`. Also [this trivial C program](https://pastebin.com/raw/vwdaRtuy) will show that there is always buffering going on even for `stdout`, and seeing both dots only showing up after a second demonstrates that there is at least line-buffering going on for C. – metatoaster Nov 03 '22 at 00:33
  • Please refrain from commenting unless you have read my question and are going to stay on topic. I already understood how to fix it. I am asking why does Python do this? – wheeler Nov 03 '22 at 00:36
  • Well, your edits with the new code examples made it more than amply clear that it is now a different question so it's now reopened. Though honestly asking "why" a language does this usually gets nowhere unless there are specific design documents that can be referenced (which you might be able to find from their tracker/past discussions, but often times this is buried behind years or decades of archives). That said, you have not addressed the usage of `-u` flag with Python. – metatoaster Nov 03 '22 at 01:13
  • That said, for Java, you used [`System.out`](https://docs.oracle.com/javase/7/docs/api/java/lang/System.html#out) which is a [PrintStream](https://docs.oracle.com/javase/7/docs/api/java/io/PrintStream.html), which may have been configured to flush automatically. – metatoaster Nov 03 '22 at 01:23
  • As expected, using `python -u` causes output to be printed immediately. I already knew this, and it is functionally equivalent to adding `flush=True` to the `print` statements, so I don't need to address it in the question. – wheeler Nov 03 '22 at 01:36
  • And the question was erroneously closed as it was anyway. You have a number of incorrect and off-topic statements, so please stop commenting unless you actually have something that is related to the question. – wheeler Nov 03 '22 at 01:43
  • Apologies, as per the question/edits as worded, I failed to gather that you were asking for the difference between the buffering schemes in streams to TTY vs. something else, which ShadowRanger got and provided the answer for. That being said, the previous threads I've linked _do_ demonstrate how outputs may be buffered (even in tty) - if you wanted the output to be dumped to stdout now you _must_ flush the stream, otherwise some form of buffering may happen, which I lumped together (incorrectly from your perspective) as the goal, rather than addressing line-buffer vs block-buffer as you wanted. – metatoaster Nov 03 '22 at 03:23
  • *Are Java and Go flushing the buffers implicitly*? Go's os.Stdout is not buffered. – Charlie Tumahai Nov 03 '22 at 03:31

1 Answer

The answer to "Why does Python do this?" is "To be roughly consistent with the C standard", under which stdout is, by default, line buffered when connected to a tty and block buffered otherwise. Back in the Python 2 days, sys.stdout (and the file objects returned by regular open, but not by io.open) was actually built on top of C's stdio.h calls, so this behavior was inherited from C directly. In Python 3, stdio.h was removed from the picture: Python 3 uses its own buffering wrappers for I/O, which, when flushed, call the underlying OS I/O routines directly, with no C-style FILE*s involved at all. The old behavior was preserved anyway, to minimize compatibility issues for existing code.
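
One way to observe this from Python itself is to ask a child interpreter how its stdout was set up. The sketch below is only an illustration (the names `CHILD` and `probe_piped_stdout` are invented here); it starts a child whose stdout is a pipe and has it report `sys.stdout.isatty()` and `sys.stdout.line_buffering`:

```python
import subprocess
import sys

# Child script: report whether stdout is a tty and whether it is line buffered.
CHILD = "import sys; print(sys.stdout.isatty(), sys.stdout.line_buffering)"


def probe_piped_stdout():
    """Run the child with stdout attached to a pipe and return its report."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,  # the child sees a pipe, not a terminal
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

On CPython 3, `probe_piped_stdout()` should return `False False`, whereas running the same one-liner directly in a terminal should report `True True`.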

The reason C (and some other languages) make stdout line-buffered when connected to a tty is to allow the user to see complete lines immediately. When it is not a tty, stdout is block-buffered on the assumption that no one is reading the lines as they appear anyway, so reducing system calls (by buffering more before each bulk write system call) is beneficial.
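
The trade-off can be made visible without a real terminal. In this rough sketch, `CountingRaw` is a hypothetical stand-in for the OS-level file descriptor that merely counts `write()` calls (each of which would be a system call on a real file), and `count_writes` is an invented helper that wraps it in the same `io.BufferedWriter` / `io.TextIOWrapper` layers Python 3 uses for stdout:

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical raw stream: discards data but counts write() calls."""

    def __init__(self):
        super().__init__()
        self.writes = 0

    def writable(self):
        return True

    def write(self, data):
        self.writes += 1
        return len(data)  # pretend everything was written


def count_writes(line_buffering):
    """Print 15 short lines; return raw write counts before and after flush()."""
    raw = CountingRaw()
    stream = io.TextIOWrapper(
        io.BufferedWriter(raw, buffer_size=8192),
        encoding="utf-8",
        line_buffering=line_buffering,
    )
    for i in range(15):
        print(f"{i}: sleeping", file=stream)
    before_flush = raw.writes
    stream.flush()
    return before_flush, raw.writes
```

On CPython, `count_writes(True)` should give `(15, 15)` (one raw write per line, nothing left to flush), while `count_writes(False)` should give `(0, 1)`: nothing reaches the raw stream until the final flush, which is exactly why piped output appears only at the end.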

System calls aren't that expensive nowadays, so it's perfectly reasonable for a modern language to default to line buffering even when not connected to a tty; it'll run slower, but a few thousand cycles here and there aren't much in the grand scheme of things. Go and Java aren't wrong to behave the way they do, and Python and C aren't wrong to go the other way. The existing behavior is consistent and works well enough for common cases, and it's trivial to add flushes or otherwise change the default buffering (e.g. with the -u command line switch for Python, or the setvbuf function in C). Changing the default might also change the behavior of existing code, so there's no strong incentive to go back and alter it.
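
Besides `-u` and `flush=True`, Python 3.7+ also lets a program switch its own stdout to line buffering with `sys.stdout.reconfigure(line_buffering=True)`. The sketch below is an assumption-laden illustration (the `CHILD` script and `run_handshake` helper are invented for it): the child prints a line and then blocks on stdin, so the parent can only read that line early if it was flushed at the newline.

```python
import subprocess
import sys

# Child script: opt in to line buffering, print, then wait for the parent.
CHILD = r'''
import sys
sys.stdout.reconfigure(line_buffering=True)  # requires Python 3.7+
print("ready")
sys.stdin.readline()  # block until the parent replies
print("done")
'''


def run_handshake():
    """Read the child's first line while the child is still running."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    first = proc.stdout.readline()  # arrives despite stdout being a pipe
    still_running = proc.poll() is None
    proc.stdin.write("go\n")
    proc.stdin.close()  # flushes the reply and lets the child finish
    rest = proc.stdout.read()
    proc.wait()
    return first.strip(), still_running, rest.strip()
```

Without the `reconfigure` call, the parent's first `readline()` would wait forever, because "ready" would sit in the child's block buffer while the child waits on stdin.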

ShadowRanger