I was looking for a way to read, in Python, from a file that is constantly being written to, e.g. a log file.

I found this: https://stackoverflow.com/a/5420116/10638608

And the answer is:

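import time

def follow(thefile):
    # Start at the end of the file so only newly appended lines are returned.
    thefile.seek(0, 2)
    while True:
        line = thefile.readline()
        if not line:
            # Nothing new yet: wait briefly, then poll again.
            time.sleep(0.1)
            continue
        yield line

if __name__ == '__main__':
    with open("run/foo/access-log", "r") as logfile:
        for line in follow(logfile):
            # Each line already ends with a newline, so don't add another.
            print(line, end="")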

I started asking myself why this works. In theory it is simple: seek to the end, wait until a line shows up, read it, and print it. But why can two processes operate on the same file, one writing to it (and closing it) and one reading from it, without the thefile handle being corrupted in any way, so that it can still read the next line from the file? Wouldn't this be a critical section of some sort?

  • Reading doesn't modify the file; that's why it works. – deadshot May 26 '20 at 23:37
  • I know it doesn't, but isn't keeping the file open (therefore keeping a handle to it) also a problem? – dabljues May 26 '20 at 23:42
  • I guess it's a question of what "corrupted" means. Suppose the other program writes a line in several different write calls. The reader may get partial lines. This can also happen if the writer keeps the file open for many line writes: its C library will buffer those writes and dump them on a block boundary. This works best if the writer keeps closing the file. – tdelaney May 26 '20 at 23:44
  • Windows is a problem unless the file has been opened in shared mode, but Linux will allow any number of readers and one writer. As for corruption, at the operating system level, files are blocks that are first read / written into a cache in RAM before going to the disk. Multiblock writes may not be atomic. – tdelaney May 26 '20 at 23:47
  • So as a general rule: a process can read from a file that's already being written to, right? No matter which programming language etc.? Keeping the file open, of course (by the reader). – dabljues May 26 '20 at 23:54
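
To make the partial-line point from the comments concrete, here is a minimal sketch, not taken from the linked answer. It opens the same file through two independent handles, one appending and one reading, and has the reader hold back any text that does not yet end in a newline. The names read_complete_lines and demo are hypothetical, the writer and reader share one process only to keep the example short, and a temporary file stands in for the real log.

```python
import os
import tempfile


def read_complete_lines(reader, pending=""):
    """Read whatever is currently available from `reader`.

    Returns (complete_lines, leftover), where leftover is the text after
    the last newline, i.e. a line the writer has not finished yet.
    """
    data = pending + reader.read()
    pieces = data.split("\n")
    leftover = pieces.pop()  # "" if the data ended with a newline
    return [piece + "\n" for piece in pieces], leftover


def demo():
    # Hypothetical stand-in for a log file that another process appends to.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Two separate handles: each keeps its own position in the file.
        with open(path, "a") as writer, open(path, "r") as reader:
            # The writer flushes in the middle of the second line.
            writer.write("first line\nsecond li")
            writer.flush()
            lines1, pending1 = read_complete_lines(reader)

            # The rest of the line arrives later.
            writer.write("ne\n")
            writer.flush()
            lines2, pending2 = read_complete_lines(reader, pending1)
        return lines1, pending1, lines2, pending2
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The first read yields only the finished line and keeps the fragment as leftover; the second read completes it. Nothing is corrupted along the way: the reader simply sees whatever bytes the writer has flushed so far, which is why a fragment has to be carried over to the next read.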

0 Answers