
I'm working on a Python script that will be accessed via the web, so multiple users may try to append to the same file at the same time. My worry is that this could cause a race condition: if several users write to the file simultaneously, it might end up corrupted.

For example:

#!/usr/bin/env python

g = open("/somepath/somefile.txt", "a")
new_entry = "foobar"
g.write(new_entry)
g.close()

Will I have to use a lockfile for this, since the operation looks risky?

Ray Y
  • Maybe you can just use syslog? – Keith Aug 07 '12 at 20:25
  • If you are on Linux or other Unix `mkfifo` may be an interesting option. `mkfifo` creates a FIFO special file. Anyone can write to the file at random, then one single process reads out of the FIFO. That way you don't need to use file locking. – Kenji Noguchi Aug 07 '12 at 20:43
  • If you open with `O_APPEND`, the target filesystem is POSIX-compliant, and your writes are all short enough to be accomplished in a single syscall, there will be no corruption in the first place. – Charles Duffy Mar 05 '20 at 13:32
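
A minimal sketch of the FIFO idea from the comment above, assuming a Unix system; the FIFO path and the single-reader arrangement here are illustrative choices, not part of the original suggestion. Each web request writes its entry into the named pipe, and one long-running process is the only thing that ever touches the log file:

import os

FIFO = "/somepath/entries.fifo"      # hypothetical path for the named pipe
LOG = "/somepath/somefile.txt"

if not os.path.exists(FIFO):
    os.mkfifo(FIFO)                  # one-time setup

def drain_fifo():
    # Run exactly one copy of this process; it serializes all appends.
    with open(LOG, "a") as out:
        while True:
            # open() blocks until a writer appears; the inner loop ends (EOF)
            # once every writer has closed the FIFO, so we simply reopen it.
            with open(FIFO) as fifo:
                for line in fifo:
                    out.write(line)
                    out.flush()

def append_entry(text):
    # Called from each web request. Writes of up to PIPE_BUF bytes
    # (4 KB on Linux) to a pipe are atomic, so entries cannot interleave.
    with open(FIFO, "w") as fifo:    # blocks until the reader has the FIFO open
        fifo.write(text + "\n")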

4 Answers

You can use file locking:

import fcntl
new_entry = "foobar"
with open("/somepath/somefile.txt", "a") as g:
    fcntl.flock(g, fcntl.LOCK_EX)   # take an exclusive lock; blocks until any other holder releases it
    g.write(new_entry)
    fcntl.flock(g, fcntl.LOCK_UN)   # release the lock so other processes can append

Note that on some systems, locking is not needed if you're only writing small buffers, because appends on these systems are atomic.

phihag
  • Nice answer. Why would you need to do a g.seek(0,2) here to go to EOF? Won't append just add to the end of the file? – Ray Y Aug 07 '12 at 20:41
  • Oh, you're right. At least on Linux, it's not required (I imagined an OS that implements the `a` mode by initially seeking to EOF). I was also playing with the idea of opening the file with another mode than `a`, but that's apparently not possible in Python. – phihag Aug 07 '12 at 20:49
  • What would happen if a user tried to append to the file while it was locked by flock? An error? – Ray Y Aug 07 '12 at 20:51
  • @RayY No, the process (or more precisely, the current thread) just blocks until the lock is released. For more information, refer to [`man 2 flock`](http://www.kernel.org/doc/man-pages/online/pages/man2/flock.2.html) – phihag Aug 07 '12 at 20:53
  • @phihag So if it's blocked, do you need to adjust your code so it keeps retrying to append until the lock is released? – Flint Jun 18 '14 at 17:18
  • @Flint No, blocked means that `flock` will not return until it has acquired a lock. – phihag Jun 19 '14 at 08:04
  • fcntl is Unix-only and does not exist on Windows. Any suggestions for Windows? – Anatoly Alekseev May 03 '18 at 05:00
  • @AnatolyAlekseev See [this answer for the Windows equivalent](https://stackoverflow.com/a/30440983/35070). – phihag May 03 '18 at 10:12
  • Can this happen: processes A and B both open the file in append mode. A gets the lock, writes, then unlocks. B now gets the lock and writes over what A wrote, because when `open` was called, the end of the file was at the position A has since written to. I am trying to write to a file from 175 processes, and some lines are messed up, even though they print correctly in the individual log files belonging to each process, so I was wondering if this might be the issue. Do I need to call seek or something to move to the end of the file for B? (I'm on Debian 10, Python 3.7.5) – GoodDeeds Oct 10 '21 at 16:55
  • @GoodDeeds No, append really appends all the time. Are you sure you are locking everywhere you write? You can check with `strace`. Another potential culprit could be line buffering. Assuming you write out lines at once, you might want to turn that off. – phihag Oct 11 '21 at 00:16
  • Thank you, I tried with the `-u` flag to disable buffering, but that didn't help. Could you please point me to a reference on how to use strace on this? – GoodDeeds Oct 11 '21 at 00:29
  • @GoodDeeds `-u` only affects stdin, stdout, stderr. You want `buffering=0` in the open call, although I believe it also should work by default. You can pick one (or all) of your 175 processes and prefix the command line to it with `strace -ff -o log` to see what they are doing. There is no specific reference for using strace for your problem; strace is a generic tool. – phihag Oct 11 '21 at 00:33
  • Thank you very much for your help! I could not use `buffering=0` since I was not opening the file in binary mode, but using `flush` on the file descriptor before releasing the lock resolved the issue. – GoodDeeds Oct 11 '21 at 01:12
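
In code, the fix described in the last comment might look like the following sketch (not part of the original answer): flushing before releasing the lock guarantees that buffered data reaches the file while the lock is still held.

import fcntl

def append_entry(path, entry):
    with open(path, "a") as g:
        fcntl.flock(g, fcntl.LOCK_EX)
        try:
            g.write(entry)
            g.flush()                      # push buffered data to the OS before unlocking
        finally:
            fcntl.flock(g, fcntl.LOCK_UN)  # only now may other processes append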

You didn't state which platform you use, but here is a module that works cross-platform: File locking in Python
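
The linked recipe is not reproduced here, but a rough standard-library sketch of the same idea could look like this. The Unix branch uses fcntl.flock; the Windows branch uses msvcrt.locking on the first byte as a mutex, which is an assumption about how such a module might work rather than the linked code:

import os

def locked_append(path, entry):
    with open(path, "a") as f:
        if os.name == "nt":
            import msvcrt
            f.seek(0)                                      # msvcrt locks from the current position
            msvcrt.locking(f.fileno(), msvcrt.LK_LOCK, 1)  # retries for ~10 s, then raises OSError
            try:
                f.write(entry)                             # "a" mode still appends despite the seek
                f.flush()
            finally:
                f.seek(0)
                msvcrt.locking(f.fileno(), msvcrt.LK_UNLCK, 1)
        else:
            import fcntl
            fcntl.flock(f, fcntl.LOCK_EX)                  # blocks until the lock is free
            try:
                f.write(entry)
                f.flush()
            finally:
                fcntl.flock(f, fcntl.LOCK_UN)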

Qiau

If you are doing this on Linux and each write is smaller than 4 KB, the append is effectively atomic and you should be good.

More to read here: Is file append atomic in UNIX?
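
To lean on that guarantee, build the whole record as one string and emit it with a single write; here is a sketch using the low-level os interface so the append maps to one write(2) system call (the path is the one from the question):

import os

entry = "foobar\n"
# O_APPEND makes the kernel position every write at the current end of file,
# so concurrent appenders on a local filesystem cannot overwrite each other.
fd = os.open("/somepath/somefile.txt", os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
try:
    os.write(fd, entry.encode())  # one short write; check the return value in real code
finally:
    os.close(fd)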

user1767754

Depending on your platform and on where the file lives (e.g. NFS), this may not be doable in a safe manner. Perhaps you can have each process write to a different file and merge the results afterwards?
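
A sketch of that idea, assuming per-process file names based on the PID and a single merge step run separately (both are illustrative choices, not part of the original answer):

import glob
import os

PARTS_DIR = "/somepath/parts"        # hypothetical directory for the per-process files
MERGED = "/somepath/somefile.txt"

def append_entry(entry):
    # Each process only ever writes to its own file, so no locking is needed.
    with open(os.path.join(PARTS_DIR, "entries.%d.txt" % os.getpid()), "a") as f:
        f.write(entry + "\n")

def merge_parts():
    # Run from a single process (e.g. a cron job), ideally while the writers
    # are idle, so no entry lands between reading a part and removing it.
    with open(MERGED, "a") as out:
        for part in sorted(glob.glob(os.path.join(PARTS_DIR, "entries.*.txt"))):
            with open(part) as f:
                out.write(f.read())
            os.remove(part)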

Karol Nowak