Python gives you access to underlying OS tools. Please review Atomic operations in UNIX.
Overall you have two requirements, atomicity and support for hard links. Also the referred answer mentions safety.
The first is satisfiable only very narrowly, and only if you drop safety. Typically you'd use POSIX advisory locks instead: if every client uses them, you can build a very robust system, SQLite being one example.
Mandatory locking is available, but not commonly enabled. The main sticking point with mandatory locks is priority inversion: a non-privileged user can block a root process if both access the same file.
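As a rough illustration of the advisory-lock approach, here is a minimal sketch using Python's fcntl module. The helper name locked_overwrite and the temporary demo file are made up for this example, and the lock only protects against other processes that take the same lock.

```python
import fcntl
import os
import tempfile

def locked_overwrite(path, data):
    """Hypothetical helper: overwrite path in place under an exclusive advisory lock."""
    with open(path, "rb+") as fo:
        # Blocks until every other cooperating process has released its lock.
        fcntl.lockf(fo, fcntl.LOCK_EX)
        try:
            fo.seek(0)
            fo.write(data)
            fo.truncate()  # drop leftover bytes if the new data is shorter
            fo.flush()     # push buffered data out before the lock is released
        finally:
            fcntl.lockf(fo, fcntl.LOCK_UN)

# Demo on a throwaway file (the path is created only for this example).
fd, demo_path = tempfile.mkstemp()
os.close(fd)
locked_overwrite(demo_path, b"hello")
with open(demo_path, "rb") as f:
    result = f.read()
os.remove(demo_path)
```

A process that writes to the file without taking the lock is not stopped by any of this, which is exactly why it only works if every client cooperates.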
Hard links imply that you have to work at the inode level. Any function in the above reference that operates on a file descriptor will work.
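To see why the inode matters, the following sketch (file names are arbitrary, created in a temporary directory) compares an in-place write through a file descriptor with the common write-to-temp-then-rename pattern. It assumes a filesystem that supports hard links.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "original")
alias = os.path.join(workdir, "alias")

with open(original, "wb") as f:
    f.write(b"old!")
os.link(original, alias)  # a second name for the same inode
inode_before = os.stat(original).st_ino

# In-place write through a file descriptor: the inode stays the same,
# so the hard link sees the new content.
fd = os.open(original, os.O_WRONLY)
try:
    os.write(fd, b"new!")
finally:
    os.close(fd)
same_inode = os.stat(original).st_ino == inode_before
with open(alias, "rb") as f:
    alias_content = f.read()

# Replace-by-rename: "original" now points at a different inode,
# while "alias" still refers to the old one.
tmp = os.path.join(workdir, "tmp")
with open(tmp, "wb") as f:
    f.write(b"xxxx")
os.rename(tmp, original)
link_broken = os.stat(original).st_ino != os.stat(alias).st_ino

shutil.rmtree(workdir)
```

The rename approach is atomic with respect to the name, but it silently detaches every other hard link, which is why it does not meet the second requirement.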
Atomic but not safe
A single write system call is atomic up to a certain filesystem-dependent threshold. If you can afford to buffer your file data in memory (anonymous or mapped), you can atomically overwrite the file. For the sake of simplicity let's assume the file size is fixed.
Consider the code below. When two processes perform this action simultaneously, both writes start at offset 0, each runs in a single system call, and in the end only one write "wins".
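#!/usr/bin/env python
import sys

# Read the whole source into memory so it can be written in one call.
with open(sys.argv[1], "rb") as src:
    data = src.read()

# buffering=0 gives a raw file object, so write() maps to one write(2) call.
with open(sys.argv[2], "rb+", buffering=0) as fo:
    fo.seek(0)
    fo.write(data)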
While this is atomic, it is not inherently safe. The write could turn out to be partial (typically only if the disk is full), or the operating system could crash during the write, leaving you with a target file that matches neither source a nor source b. If that's acceptable because you made a backup, go ahead and use it :)
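If you want to at least detect the partial-write case mentioned above, one option is to check the return value of the system call. This is only a sketch: overwrite_once is a made-up name, os.pwrite needs Python 3.3+ on a Unix-like system, and fsync narrows the crash window without making the write crash-safe.

```python
import os
import tempfile

def overwrite_once(path, data):
    """Hypothetical helper: one write at offset 0; returns True if it was complete."""
    fd = os.open(path, os.O_WRONLY)
    try:
        written = os.pwrite(fd, data, 0)  # single system call, explicit offset
        os.fsync(fd)                      # ask the OS to flush the data to disk
    finally:
        os.close(fd)
    return written == len(data)

# Demo on a throwaway fixed-size file.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"aaaa")
os.close(fd)
ok = overwrite_once(demo_path, b"bbbb")
with open(demo_path, "rb") as f:
    content = f.read()
os.remove(demo_path)
```

When the helper returns False the target is already damaged, so this tells you to restore from the backup rather than preventing the problem.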
P.S. If the file size is not fixed, adopt a file format where the file header specifies the size of the data in the file.
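One possible shape for such a format, purely as an illustration: an 8-byte little-endian length followed by the payload. The layout and the function names are assumptions made for this sketch, not an existing standard.

```python
import struct

# Hypothetical layout: 8-byte little-endian payload length, then the payload.
HEADER = struct.Struct("<Q")

def pack_record(payload):
    """Prefix the payload with its length so header and data go out in one write."""
    return HEADER.pack(len(payload)) + payload

def unpack_record(blob):
    """Return only the bytes the header says are valid, ignoring any stale tail."""
    (size,) = HEADER.unpack_from(blob, 0)
    return blob[HEADER.size:HEADER.size + size]

record = pack_record(b"short")
# Simulate leftover bytes from an earlier, longer version of the file.
on_disk = record + b"STALE-TAIL"
payload = unpack_record(on_disk)
```

Because the header and payload are built into one buffer, they can still be written with a single call, so a reader never has to rely on the file's length to know where the data ends.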
P.P.S. Although the sendfile system call now works on regular files for both input and output, testing shows that the operation is not atomic. Here one thread tried to send 1000M of zeros and another 1000M of ff's. The result is exactly 1000M, but the data is interleaved; the return value of one of the sendfile calls shows a partial write, but that size is inconsistent with the number of zeros actually written:
(env33)[dima@bmg ~]$ hexdump oux
0000000 ffff ffff ffff ffff ffff ffff ffff ffff
*
03c0000 0000 0000 0000 0000 0000 0000 0000 0000
*
3e800000