286
  1. How often does Python flush to a file?
  2. How often does Python flush to stdout?

I'm unsure about (1).

As for (2), I believe Python flushes to stdout after every new line. But, if you overload stdout to be to a file, does it flush as often?

Michael Currie
Tim McJilton

5 Answers

405

For file operations, Python applies a default buffering policy (the "system default" in the docs quoted below) unless you configure it to do otherwise. You can request a specific buffer size, no buffering, or line buffering.

For example, the open function takes a buffer size argument.

http://docs.python.org/library/functions.html#open

"The optional buffering argument specifies the file’s desired buffer size:"

  • 0 means unbuffered,
  • 1 means line buffered,
  • any other positive value means use a buffer of (approximately) that size.
  • A negative buffering means to use the system default, which is usually line buffered for tty devices and fully buffered for other files.
  • If omitted, the system default is used.

code:

# Python 3 allows buffering=0 only in binary mode; text mode raises ValueError.
bufsize = 0
f = open('file.txt', 'wb', buffering=bufsize)
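To see the difference between the default and line buffering, here is a minimal sketch (Python 3). The helper name `bytes_on_disk_after_write` and the use of a temporary file are my own choices for illustration, not anything from the docs. It writes one short line and checks how many bytes have reached the file before it is closed.

```python
import os
import tempfile


def bytes_on_disk_after_write(buffering):
    """Write one short line and return the file size while the file is still open."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        f = open(path, "w", buffering=buffering)
        try:
            f.write("hello\n")
            # Nothing has been flushed explicitly at this point.
            return os.path.getsize(path)
        finally:
            f.close()
    finally:
        os.remove(path)


default_size = bytes_on_disk_after_write(-1)  # default policy: block buffered for regular files
line_size = bytes_on_disk_after_write(1)      # line buffered: flushed at the newline

print("default buffering:", default_size)
print("line buffering:", line_size)
```

With the default policy the short line should still be sitting in Python's buffer, so the size is 0; with `buffering=1` the newline triggers a flush.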
zyxue
Corey Goldberg
  • 34
    +1 for the "line buffered" part. That's exactly what I was looking for and it works like a charm. – rein Mar 06 '13 at 21:33
  • 3
    Using Python 3.4.3 when I do `open('file.txt', 'w', 1)` I get proper line buffering. But if I do anything larger (I wanted `open('file.txt', 'w', 512)`) it buffers the full `io.DEFAULT_BUFFER_SIZE` of 8192. Is that a Python bug, a Linux bug, or an ID10t bug? – Bruno Bronosky Dec 01 '17 at 17:00
  • Is it possible to change the buffering for the _already opened_ streams? Say, I want `stdout` to be line-buffered regardless of whether it is a console or redirected to a file? – Mikhail T. Sep 23 '18 at 03:36
  • What confuses me is what the term `flushing` even means. Why do we need it? What is it for? Why should I care about it? – Charlie Parker Mar 04 '19 at 19:44
  • 2
    @CharlieParker when you call `write()` on a file handle, the output is buffered in memory and accumulated until the buffer is full... at which time the buffer gets "flushed" (content is written from the buffer to the file). You can explicitly flush the buffer by calling the `flush()` method on a file handle. – Corey Goldberg Mar 05 '19 at 04:59
  • 6
    Note that unbuffered (0) is only available in binary mode and line buffered (1) is only available in text mode. – ZaydH May 08 '19 at 07:23
  • It doesn't answer the question about flush. Also, "operating system's default buffering" is misleading. This has nothing to do with the OS's internal buffering. This is solely about the buffering done in userland, in this case by the Python standard library. The "system default" in the old (now outdated) doc quote probably means the Python implementation's default. The docs have been improved since then to clarify how the buffer size is determined from the OS block size when available, with a fallback to the implementation default. – Pedro Pedruzzi Apr 28 '20 at 20:50
201

You can also force flush the buffer to a file programmatically with the flush() method.

with open('out.log', 'w+') as f:
    f.write('output is ')
    # some work
    s = 'OK.'
    f.write(s)
    f.write('\n')
    f.flush()
    # some other work
    f.write('done\n')
    f.flush()

I have found this useful when tailing an output file with tail -f.
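If the data also has to survive a crash or power loss, the comments below point out that `flush()` alone is not enough. A minimal sketch of the two-step version, assuming a scratch file in a temporary directory (the path is made up for the example):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "out.log")  # hypothetical scratch location

with open(path, "w") as f:
    f.write("checkpoint reached\n")
    f.flush()             # Python's user-space buffer -> operating system
    os.fsync(f.fileno())  # ask the OS to push its own cache to the storage device
    size_after_sync = os.path.getsize(path)

print(size_after_sync)
```

`flush()` is what makes the line visible to `tail -f`; `os.fsync()` is only needed when durability matters, and it is noticeably slower.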

Martin Thoma
kortina
  • 75
    From the docs: `Note: flush() does not necessarily write the file’s data to disk. Use flush() followed by os.fsync() to ensure this behavior.` – bobismijnnaam Oct 07 '15 at 08:08
  • 2
    @bobismijnnaam next time link to said docs. Only reference I can find is from https://github.com/jprzywoski/python-reference/blob/master/source/docs/file/flush.rst and I don't know who that is. – Bruno Bronosky Nov 30 '17 at 21:33
  • 8
    @Bruno Bronosky Good point. [Docs:](https://docs.python.org/2/library/stdtypes.html#file.flush) `Note: flush() does not necessarily write the file’s data to disk. Use flush() followed by os.fsync() to ensure this behavior.` – bobismijnnaam Nov 30 '17 at 23:08
  • 4
    What confuses me is what the term `flushing` even means. Why do we need it? What is it for? Why should I care about it? – Charlie Parker Mar 04 '19 at 19:45
  • 2
    @CharlieParker when you write, you write to a copy of (part of) the file in RAM, which might not be saved to disk for a while. It improves performance, but can mean data loss if that copy never gets written (disk removed, OS crashes, etc). flush() tells Python to immediately write that buffer back to disk. (Then, os.fsync() tells the OS to also do it. There are many layers of buffers...) – Rena Jan 11 '20 at 20:57
  • @bobismijnnaam This might have been fixed in Python 3. I don't see any such note in the [new documentation](https://docs.python.org/3/library/io.html#io.IOBase.flush). – Jeyekomon Aug 22 '22 at 08:05
  • 1
    @Jeyekomon There's nothing to fix - `flush` flushes the user-space buffers and `os.fsync` returns only when the OS is told that the file is persisted (it still may not be physically written depending on the filesystem - eg. nfs, or the storage-hardware's own caches etc - but as long as they say they will persist it, it should be considered fine because the manufacturer is taking the responsibility). As for the documentation, check the os.fsync docs in python: https://docs.python.org/3/library/os.html#os.fsync – ustulation May 21 '23 at 17:04
15

You can also check the default buffer size by reading the read-only DEFAULT_BUFFER_SIZE attribute of the io module.

import io
print(io.DEFAULT_BUFFER_SIZE)
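To connect that number to the flushing behaviour, here is a rough sketch (Python 3). It pins the buffer size to `io.DEFAULT_BUFFER_SIZE` explicitly, because the size Python picks on its own can depend on the device's block size. The temporary file and variable names are just for illustration.

```python
import io
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

# Pin the buffer size so the demo does not depend on the block-size heuristic.
f = open(path, "wb", buffering=io.DEFAULT_BUFFER_SIZE)
try:
    f.write(b"x" * 10)  # far smaller than the buffer, stays in memory
    size_after_small_write = os.path.getsize(path)

    f.write(b"x" * (2 * io.DEFAULT_BUFFER_SIZE))  # cannot fit in the buffer
    size_after_large_write = os.path.getsize(path)
finally:
    f.close()
    os.remove(path)

print(size_after_small_write, size_after_large_write)
```

The small write should leave the file empty on disk, while the large one forces Python to write something out without any explicit `flush()`. Exactly how many bytes are on disk after the large write is an implementation detail I would not rely on.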
N Randhawa
  • 1
    Thanks! It's good to know *that* python sets it as OS defines... but this helps find out *what* the OS pre-defines. – Cometsong Mar 28 '18 at 13:43
14

I don't know if this applies to Python as well, but I think it depends on the operating system that you are running.

On Linux for example, output to terminal flushes the buffer on a newline, whereas for output to files it only flushes when the buffer is full (by default). This is because it is more efficient to flush the buffer fewer times, and the user is less likely to notice if the output is not flushed on a newline in a file.

You might be able to auto-flush the output if that is what you need.

EDIT: I think you would auto-flush in Python this way (based on the example here)

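import sys

# Python 2 accepted buffering=0 (no buffer) here. Python 3 text mode does not,
# so use 1 (line buffered): output is flushed at every newline.
fsock = open('out.log', 'w', 1)
sys.stdout = fsock
# do whatever
sys.stdout = sys.__stdout__  # restore the original stream before closing
fsock.close()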
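On Python 3.7+ there is also a way to switch an already opened text stream to line buffering without replacing it: `reconfigure(line_buffering=True)`. The sketch below uses an in-memory `io.TextIOWrapper` as a stand-in for a redirected `sys.stdout` (that stand-in is my assumption, so the example does not touch the real stdout); the same call should work on `sys.stdout` when it is a regular `TextIOWrapper`.

```python
import io

raw = io.BytesIO()
# Stand-in for a stdout that has been redirected to a file (not a tty).
stream = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")

stream.write("first line\n")
before = raw.getvalue()  # still held in the wrapper's buffer

stream.reconfigure(line_buffering=True)  # flushes pending data, then enables line buffering
stream.write("second line\n")
after = raw.getvalue()   # the newline triggered a flush

print(before)
print(after)
```

For one-off cases, `print(..., flush=True)` (Python 3.3+) does the same job for a single call.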
Bill the Lizard
KLee1
2

Here is another approach; it is up to the OP to choose which one they prefer.

If you include the code below in the __init__.py file before any other code, messages printed with print and any errors will no longer be logged to Ableton's Log.txt but to separate files on your disk:

import sys

path = "/Users/#username#"

errorLog = open(path + "/stderr.txt", "w", 1)
errorLog.write("---Starting Error Log---\n")
sys.stderr = errorLog
stdoutLog = open(path + "/stdout.txt", "w", 1)
stdoutLog.write("---Starting Standard Out Log---\n")
sys.stdout = stdoutLog

(for Mac, change #username# to the name of your user folder. On Windows the path to your user folder will have a different format)

When you open the files in a text editor that refreshes its content when the file on disk is changed (example for Mac: TextEdit does not but TextWrangler does), you will see the logs being updated in real-time.
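If the redirection should only last for part of a program rather than for the whole session, a variation on the same idea is `contextlib.redirect_stdout` (Python 3.4+), which restores `sys.stdout` automatically. This is only a sketch and not part of the original script; the temporary log path is made up for the example.

```python
import contextlib
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "stdout.txt")  # hypothetical location

with open(log_path, "w", 1) as log:  # 1 = line buffered, as in the code above
    with contextlib.redirect_stdout(log):
        print("---Starting Standard Out Log---")
        # The line is already on disk because the newline triggered a flush.
        size_while_open = os.path.getsize(log_path)

# sys.stdout is back to normal here.
with open(log_path) as f:
    logged = f.read()
```

`contextlib.redirect_stderr` (Python 3.5+) works the same way for the error log.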

Credits: this code was copied mostly from the liveAPI control surface scripts by Nathan Ramella

Mattijs