38

I'm running a test, and found that the file doesn't actually get written until I control-C to abort the program. Can anyone explain why that would happen?

I expected it to write at the same time, so I could read the file in the middle of the process.

import os
from time import sleep

f = open("log.txt", "a+")
i = 0
while True:
  f.write(str(i))
  f.write("\n")
  i += 1
  sleep(0.1)
Marvin K
  • 437
  • 1
  • 5
  • 11

8 Answers8

88

Writing to disk is slow, so many programs store up writes into large chunks which they write all-at-once. This is called buffering, and Python does it automatically when you open a file.

When you write to the file, you're actually writing to a "buffer" in memory. When it fills up, Python will automatically write it to disk. You can tell it "write everything in the buffer to disk now" with

f.flush()

This isn't quite the whole story, because the operating system will probably buffer writes as well. You can tell it to write the buffer of the file with

os.fsync(f.fileno())

Finally, you can tell Python not to buffer a particular file with open(f, "w", 0) or only to keep a 1-line buffer with open(f,"w", 1). Naturally, this will slow down all operations on that file, because writes are slow.

Katriel
  • 120,462
  • 19
  • 136
  • 170
  • 5
    Note that gratuitous `fsync()`s are the bane of people with laptops with spinning-disc drives who care about their battery life. You don't need to call `fsync()` to get content to be visible to other programs; it's closer to being about visibility after an unexpected reboot (though it's not necessarily a sufficient condition there either). – Charles Duffy Mar 22 '12 at 17:18
  • @CharlesDuffy: `You don't need to call fsync() to get content to be visible to other programs`: why? I think that if your program is not finished, and other programs read the file, its content could be not updated. – Marco Sulla Sep 16 '15 at 15:59
  • 2
    @MarcoSulla, `fsync()` syncs from the block cache to the disk, but the block cache is shared with other programs as well. – Charles Duffy Sep 16 '15 at 16:00
  • 2
    @MarcoSulla, ...thus, as long as you've done a successful `write()` syscall -- which `flush()` guarantees -- the content is recognized by the operating system and will be made available to other software. Only exception would be on shared (network) filesystems. – Charles Duffy Sep 16 '15 at 16:02
  • @MarcoSulla, ...I haven't seen any acknowledgement to the above -- do I need to find sources/documentation? – Charles Duffy Sep 18 '15 at 02:28
  • @CharlesDuffy: Nope, just personal experience. I recently wrote a small script that writes on a csv in append mode. Even if I enclose the writes in a `with` statement, sometimes the 1st write is *not* accomplished, only the 2nd one is done. It *seems* that the problem is gone after adding `fsync()`. And this happens in the same script instance! Maybe it's my script that's wrong, can't say for sure. Anyway, if you want to discuss it further, I suppose it's better to start a chat (when I'm at home) :) – Marco Sulla Sep 18 '15 at 13:26
8

You need to f.close() to flush the file write buffer out to the file. Or in your case you might just want to do a f.flush(); os.fsync(); so you can keep looping with the opened file handle.

Don't forget to import os.

deed02392
  • 4,799
  • 2
  • 31
  • 48
  • 1
    Indeed but this doesn't guarentee a full writeout to disk without an os.fsync() – deed02392 Mar 22 '12 at 15:01
  • 2
    Not all the way to disk, but if it's visible to other programs, that's what he's asking for here. os.fsync() is expensive, and it shouldn't be used unless you _really_ know you want it (and, typically, have a way for the user to turn it off). Note that even most databases have a way to disable fsync -- sometimes the user *wants* to risk data corruption to make things fast. – Charles Duffy Mar 22 '12 at 15:03
  • True, +1. Asker should also bear in mind differing behaviour between OS'. – deed02392 Mar 22 '12 at 15:04
4

You have to force the write, so I i use the following lines to make sure a file is written:

# Two commands together force the OS to store the file buffer to disc
    f.flush()
    os.fsync(f.fileno())
octopusgrabbus
  • 10,555
  • 15
  • 68
  • 131
Gil.I
  • 895
  • 12
  • 23
1

You will want to check out file.flush() - although take note that this might not write the data to disk, to quote:

Note: flush() does not necessarily write the file’s data to disk. Use flush() followed by os.fsync() to ensure this behavior.

Closing the file (file.close()) will also ensure that the data is written - using with will do this implicitly, and is generally a better choice for more readability and clarity - not to mention solving other potential problems.

Gareth Latty
  • 86,389
  • 17
  • 178
  • 183
1

This is a windows-ism. If you add an explicit .close() when you're done with file, it'll appear in explorer at that time. Even just flushing it might be enough (I don't have a windows box handy to test). But basically f.write does not actually write, it just appends to the write buffer - until the buffer gets flushed you won't see it.

On unix the files will typically show up as a 0-byte file in this situation.

Tyler Eaves
  • 12,879
  • 1
  • 32
  • 39
0

The data being written to the file is being buffered, and only when the buffer is full or the file is closed is the buffered data written to the file.

If you open a file in text mode, and you want the output to be flushed every time you finish writing a line, then set the buffering to line-buffered when opening the file:

f = open("log.txt", "a+", buffering=1)

See https://docs.python.org/3/library/functions.html#open for more details.

If you instead want to ensure that the file is up to date with what has been written thus far at a particular point in the program and don't mind the output being buffered in between such points, then you can instead explicitly flush the buffer at such points with

f.flush()

Flushing is also useful when using line buffering if you want a partial line to be flushed.

Logitude
  • 61
  • 4
0

File Handler to be flushed.

f.flush()
Siva Arunachalam
  • 7,582
  • 15
  • 79
  • 132
0

The file does not get written, as the output buffer is not getting flushed until the garbage collection takes effect, and flushes the I/O buffer (more than likely by calling f.close()).

Alternately, in your loop, you can call f.flush() followed by os.fsync(), as documented here.

f.flush()
os.fsync()

All that being said, if you ever plan on sharing the data in that file with other portions of your code, I would highly recommend using a StringIO object.

nesv
  • 786
  • 6
  • 10