
There seems to be a difference in how stdout is buffered on Windows and on Linux when it is written to the console. Consider this small Python script:

import time
for i in xrange(10):
    time.sleep(1)
    print "Working",  # trailing comma: no newline is written (Python 2)

When running this script on Windows, we see each "Working" appear one after another, with a one-second wait in between. On Linux, we have to wait 10 seconds and then the whole line appears at once.

If we change the last line to print "Working" (without the trailing comma, so that each call ends with a newline), every "Working" appears individually on Linux as well.

So on Linux, stdout seems to be line-buffered, and on Windows it seems not to be buffered at all. We can switch off the buffering with the -u option (in this case the script behaves the same on Linux as on Windows). The documentation says:

-u Force stdin, stdout and stderr to be totally unbuffered.

So it does not actually say that stdin and stdout are buffered when the -u option is absent. Hence my questions:

  1. What is the reason for different behavior on Linux/Windows?
  2. Is there any guarantee that stdout will be buffered when it is redirected to a file, regardless of the OS? At least this seems to be the case on Windows and Linux.

My main concern is not (as some answers assume) when the information is flushed, but that if stdout isn't buffered it might be a severe performance hit and one should not rely on it.

Edit: It might be worth noting that for Python 3 the behavior is the same on Linux and Windows (which is not really surprising, because the behavior is configured explicitly by parameters of the print function).

ead
  • 7
    Python 2 uses C stdio, and the Windows CRT defaults to no buffering for stdout when it's a tty (i.e. character device) as opposed to a disk file or pipe. – Eryk Sun Aug 07 '17 at 13:02

5 Answers

6

Assuming you're talking about CPython (likely), this has to do with the behaviour of the underlying C implementations.

The ISO C standard mentions (C11 7.21.3 Files /3) three modes:

  • unbuffered (characters appear as soon as possible);
  • fully buffered (characters appear when the buffer is full); and
  • line buffered (characters appear on newline output).

There are other triggers that cause the characters to appear (such as buffer filling up even if no newline is output, requesting input under some circumstances, or closing the stream) but they're not important in the context of your question.
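As an illustration only: Python 3's built-in open() exposes the same three modes through its buffering argument (0, 1, or the default), so the difference can be observed on a scratch file. This is a sketch that assumes Python 3 and goes through Python's own io layer rather than the C stdio that Python 2 relies on; bytes_visible_before_flush is a made-up helper name.

```python
import os
import tempfile


def bytes_visible_before_flush(buffering, mode, payload):
    """Write payload with the given buffering and return the size of the
    file on disk *before* any explicit flush() or close()."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    f = open(path, mode, buffering=buffering)
    try:
        f.write(payload)
        return os.path.getsize(path)
    finally:
        f.close()
        os.remove(path)


if __name__ == "__main__":
    # buffering=0 is only allowed in binary mode: unbuffered
    print(bytes_visible_before_flush(0, "wb", b"Working"))
    # buffering=1 in text mode: line buffered, no newline written yet
    print(bytes_visible_before_flush(1, "w", "Working"))
    # default buffering: fully buffered, even a newline does not flush
    print(bytes_visible_before_flush(-1, "w", "Working\n"))
```

The unbuffered write should reach the file immediately, while the other two calls should still be sitting in the buffer when the size is measured.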

What is important is 7.21.3 Files /7 in that same standard:

As initially opened, the standard error stream is not fully buffered; the standard input and standard output streams are fully buffered if and only if the stream can be determined not to refer to an interactive device.

Note the wiggle room there. Standard output can either be line buffered or unbuffered unless the implementation knows for sure it's not an interactive device.

In this case (the console), it is an interactive device, so the implementation is not permitted to use fully buffered mode. It is, however, allowed to select either of the other two modes, which is why you're seeing the difference.

Unbuffered output would see the messages appear as soon as you output them (a la your Windows behaviour). Line-buffered would delay until output of a newline character (your Linux behaviour).
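To see which case applies to a given run, a script can ask the stream itself. A small sketch, assuming Python 3 (where text streams expose a line_buffering attribute); describe_stream is a hypothetical helper, not a standard function.

```python
def describe_stream(stream):
    """Summarize what a stream reports about itself."""
    return {
        # True when the stream is attached to a terminal/console
        "isatty": stream.isatty(),
        # Python 3 text streams expose this; other objects may not
        "line_buffering": getattr(stream, "line_buffering", None),
    }
```

Calling describe_stream(sys.stdout) and printing the result to sys.stderr makes the answer visible even when stdout is redirected to a file or a pipe.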

If you really want to ensure your messages are flushed regardless of mode, just flush them yourself:

import time, sys
for i in xrange(10):
    time.sleep(1)
    print "Working",
    sys.stdout.flush()
print

In terms of guaranteeing that output will be buffered when redirecting to a file, that would be covered in the quotes from the standard I've already shown. If the stream can be determined to be using a non-interactive device, it will be fully buffered. That's not an absolute guarantee since it doesn't state how that's determined but I'd be surprised if any implementation couldn't figure that out.

In any case, you can test specific implementations just by redirecting the output and monitoring the file to see if it flushes once per output or at the end.
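One way to run that test without watching the file by hand is to let a child process report how many bytes have reached its redirected stdout before it exits. The following is a sketch under a few assumptions: Python 3.7 or later for the child interpreter, a made-up helper name (bytes_on_disk_before_exit), and PYTHONUNBUFFERED removed from the environment so that only the command-line arguments decide the buffering.

```python
import os
import subprocess
import sys
import tempfile

# Child program: write without a newline, then report (on stderr) how many
# bytes have actually arrived in the file that stdout is redirected to.
CHILD = (
    "import os, sys\n"
    "sys.stdout.write('Working')\n"
    "sys.stderr.write(str(os.fstat(sys.stdout.fileno()).st_size))\n"
)


def bytes_on_disk_before_exit(extra_args=()):
    """Run CHILD with stdout redirected to a temporary file and return the
    file size the child observed right after its write."""
    env = dict(os.environ)
    env.pop("PYTHONUNBUFFERED", None)  # let only extra_args control buffering
    with tempfile.TemporaryFile() as out:
        proc = subprocess.run(
            [sys.executable, *extra_args, "-c", CHILD],
            stdout=out,
            stderr=subprocess.PIPE,
            env=env,
            check=True,
        )
    return int(proc.stderr.decode().strip())


if __name__ == "__main__":
    print("default:", bytes_on_disk_before_exit())
    print("with -u:", bytes_on_disk_before_exit(["-u"]))
```

If redirected output is buffered, the default run reports 0 bytes, and the -u run reports all 7. Other interpreters or versions may differ, which is exactly what the test is for.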

paxdiablo
4

The behavior differs because the buffering is generally unspecified, which means implementations can do whatever they want. And, it means that implementations can change at any time, or vary in undocumented ways, possibly even on the same platform.

For example, if you print a "long enough" string on Linux, with no newline (\n), it will likely be written through as if it had a newline (because it exceeds the buffer). You may also find the buffer size varies between stdout, pipes, and files.

It's really bad to depend on unspecified behavior, so use flush() when you really need the bytes to be written.

And if you need to control buffering (e.g. for performance reasons), then you need to implement your own buffering on top of write() and flush(). It's pretty straightforward to do, and that gives you complete control over how and when bytes are actually written.
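A minimal sketch of such a layer, assuming Python 3 and a target object that offers write() and flush(). The class name ChunkedWriter and the max_chars threshold are made up for illustration.

```python
class ChunkedWriter:
    """Collect small writes in memory and pass them on in larger chunks."""

    def __init__(self, stream, max_chars=8192):
        self._stream = stream
        self._max_chars = max_chars
        self._parts = []
        self._size = 0

    def write(self, text):
        self._parts.append(text)
        self._size += len(text)
        # Hand the data on only once enough has accumulated.
        if self._size >= self._max_chars:
            self.flush()

    def flush(self):
        if self._parts:
            self._stream.write("".join(self._parts))
            self._parts = []
            self._size = 0
        # Also flush the underlying stream, whatever its own buffering is.
        self._stream.flush()
```

Wrapping sys.stdout as ChunkedWriter(sys.stdout) and calling flush() at the points where output must be visible keeps the number of underlying writes small regardless of how the platform buffers the console.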

payne
  • Actually, for me it is about the performance: unbuffered stdout would mean a huge performance hit and if I have no guarantee I have to use an additional layer to ensure buffering. – ead Aug 15 '17 at 04:08
  • 2
    Unspecified rather than undefined. Undefined means *anything* can happen, unspecified means one of a limited number of things can happen. – paxdiablo Aug 15 '17 at 04:57
1

Windows and Linux have very different console output drivers. On Linux, your program's output is buffered until a \n is written.

If you want to force the buffer to flush manually, use:

import sys
sys.stdout.flush()
Lucas Hendren
  • I don't quite understand, what you mean by "console output drivers". For Python3's `print` the behavior is the same on Windows and Linux: stdout is line-buffered. – ead Aug 12 '17 at 19:27
1

This already has answers elsewhere, but I will summarize below.

  1. The behavior differs between Windows and Linux because of the way the print statement is implemented (as noted in the comment by eryksun). You can get more information about that over here and here.

  2. This can be remedied in many ways in Python. More on that over here.

AkshayDandekar
1

The issue has been known since 2011; see Python issue #11633.

The print function does not do any buffering. The file object it writes to may. Even sys.stdout may or may not.

To take the differences in behavior into account, the resolution was to update the documentation so that it includes the last sentence of the following passage ("Output buffering is determined by file."):

The file argument must be an object with a write(string) method; if it is not present or None, sys.stdout will be used. Output buffering is determined by file.

It is worth noting that:

Guido said that it is what it should do and that "Apps that need flushing should call flush()." So a code change is rejected.
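In Python 3.3 and later, print also accepts a flush keyword, so the explicit call can be folded into the print itself. The sketch below assumes Python 3; RecordingFile and report_progress are hypothetical names used only to show that print writes to whatever file object it is given and calls that object's flush() when asked.

```python
class RecordingFile:
    """Minimal file-like object: records writes and counts flush() calls."""

    def __init__(self):
        self.written = []
        self.flush_calls = 0

    def write(self, text):
        self.written.append(text)

    def flush(self):
        self.flush_calls += 1


def report_progress(out, steps=3):
    """Print a progress marker per step, flushing after each one."""
    for _ in range(steps):
        # end=" " keeps everything on one line; flush=True calls out.flush()
        print("Working", end=" ", file=out, flush=True)
```

Passing sys.stdout as out gives the one-marker-at-a-time behavior from the question on any platform, because the buffering of the file object no longer decides when the text appears.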

Fabien