I have a file which I'm atomically replacing in Python, while trying to persistently retain a lock.

(Yes, I'm well aware that this will wreak havoc on any other programs waiting for a lock on the file unless they check for the directory entry pointing to a new inode after they actually receive their lock; that check is happening in practice).
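For readers unfamiliar with that check, a minimal sketch might look like the following. It assumes Python 3 and `flock`-style locking; `open_locked` is a hypothetical helper name, and the retry policy is illustrative rather than the code actually in use.

```python
import fcntl
import os


def open_locked(path, mode="r+b"):
    """Open `path`, flock it, and retry if the file was replaced meanwhile.

    Hypothetical helper: the name and the unbounded retry loop are assumptions.
    """
    while True:
        f = open(path, mode)
        try:
            # May block while another process holds the lock.
            fcntl.flock(f.fileno(), fcntl.LOCK_EX)
            held = os.fstat(f.fileno())   # inode we actually locked
            current = os.stat(path)       # inode the directory entry points to now
        except BaseException:
            f.close()
            raise
        if (held.st_dev, held.st_ino) == (current.st_dev, current.st_ino):
            return f  # the lock is on the live file
        # The file was renamed over while we waited: our lock is on a stale
        # inode. Closing drops that lock; loop to lock the new file.
        f.close()
```

The comparison uses both `st_dev` and `st_ino`, since an inode number is only unique within one filesystem.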

import os, os.path, tempfile, fcntl

def replace_file(f, new_text):
    f_dir = os.path.dirname(f.name)
    with tempfile.NamedTemporaryFile(dir=f_dir) as temp_file:
        temp_file.write(new_text)
        temp_file.flush()
        os.fsync(temp_file.fileno())
        dest_file = os.fdopen(os.dup(temp_file.fileno()), 'r+b')
        fcntl.flock(dest_file.fileno(), fcntl.LOCK_EX)
        os.rename(temp_file.name, f.name)
        temp_file.delete = False
    # ...and after more paranoia, like fsync()ing the directory it's in...
    return dest_file

f = open('/tmp/foo', 'w')
f = replace_file(f, "new string")
print f.name # name is <fdup>, not /tmp/foo

I'm hard-pressed to find a workaround for this that doesn't involve dropping the lock even temporarily after the rename has taken place.

Charles Duffy
  • (re: obvious question -- why use atomic renames *and* flock? -- there are multiple edit modes available; in some cases we're doing appends that should be treated as atomic [hence flock], and in others we're replacing the file in full, and don't want to allow data loss even in event of a power loss [hence replace-and-rename]). – Charles Duffy Oct 18 '16 at 18:43
  • `os.rename` is an atomic operation if source and destination are on the same filesystem (only node rename). Why do you need locking? – Laurent LAPORTE Oct 18 '16 at 18:58
  • @LaurentLAPORTE, see the very first comment on the question. The locks aren't needed for this case, but for scenarios where we're doing in-place updates. – Charles Duffy Oct 18 '16 at 19:25

2 Answers

If your code runs on Linux, here's a way to get the filename from a file descriptor:

...
f = replace_file(f, "new string")
print os.readlink('/proc/self/fd/%d' % f.fileno())

Reference: https://stackoverflow.com/a/1189582/2644759
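To see what procfs reports across a rename, a small self-contained check along these lines could be used. It is a sketch that assumes Linux with `/proc` mounted and Python 3; `fd_path` and `demo` are hypothetical names introduced only for illustration.

```python
import os
import tempfile


def fd_path(fd):
    """Return the path procfs currently reports for an open descriptor (Linux only)."""
    return os.readlink("/proc/self/fd/%d" % fd)


def demo():
    # realpath() so the comparison is not thrown off by a symlinked temp dir.
    workdir = os.path.realpath(tempfile.mkdtemp())
    old_path = os.path.join(workdir, "scratch")
    new_path = os.path.join(workdir, "foo")
    with open(old_path, "wb") as f:
        before = fd_path(f.fileno())   # path before the rename
        os.rename(old_path, new_path)
        after = fd_path(f.fileno())    # path after the rename, same descriptor
    return before, after, old_path, new_path
```

Comparing `before` and `after` against `old_path` and `new_path` shows whether the descriptor's reported path follows the rename on a given system.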

Philip Tzou
  • Hmm. So the proposal is to take all code that references `f.name`, and do a lookup via procfs instead? Inasmuch as using this answer requires modifying everything that uses the `replace_file` function, not the function itself, I'd almost be tempted to build a wrapper for the file object's type and return that. – Charles Duffy Oct 18 '16 at 19:27
  • ...actually, that wrapper could do it better: By allowing the `name` attribute to be overridden explicitly (it's read-only on the base type), it wouldn't need the procfs lookup at all, but could accept it as an assertion by the code doing the creation. – Charles Duffy Oct 18 '16 at 19:29
  • BTW -- are you sure the procfs lookup would return `/tmp/foo`, not `/tmp/random-tempfile-name (deleted)`? – Charles Duffy Oct 18 '16 at 19:30
  • @CharlesDuffy I'm not sure, since I didn't test it. It sounds like you have a high-throughput system where not much inside the `replace_file` function can be changed. Alternatively, you could just close the old `fd` and re-open a fresh one when returning. – Philip Tzou Oct 18 '16 at 19:42
  • Closing and re-opening drops the lock. – Charles Duffy Oct 18 '16 at 19:43
  • @CharlesDuffy I see. I also tested `readlink` after renaming: the file path did get updated, so you don't need to worry that it won't return the correct path. Writing a lightweight wrapper seems like a much better idea. – Philip Tzou Oct 18 '16 at 20:52
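The wrapper idea from the comments could be sketched roughly as below. This assumes Python 3; `NamedFileProxy` is a hypothetical class name, and the sketch only forwards the handful of special methods a file object commonly needs.

```python
class NamedFileProxy(object):
    """Delegate to a wrapped file object, but report a caller-supplied name."""

    def __init__(self, fileobj, name):
        self._fileobj = fileobj
        self.name = name  # asserted by the creator; no procfs lookup involved

    def __getattr__(self, attr):
        # Only invoked for attributes not found on the proxy itself,
        # so read(), write(), fileno(), etc. reach the real file object.
        return getattr(self._fileobj, attr)

    # Special methods are looked up on the type, not via __getattr__,
    # so the ones needed have to be forwarded explicitly.
    def __enter__(self):
        self._fileobj.__enter__()
        return self

    def __exit__(self, *exc_info):
        return self._fileobj.__exit__(*exc_info)

    def __iter__(self):
        return iter(self._fileobj)
```

With something like this, `replace_file` could end with `return NamedFileProxy(dest_file, f.name)`, so callers reading `.name` see the destination path while the lock stays on the same descriptor.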

A simplicity-focused solution is to use an entirely separate lockfile with a different name (i.e. `<filename>.lck`).

  • Using a separate lockfile means that the code performing the write-and-rename operation doesn't need to be involved in locking at all, as the rename operation doesn't interact with the lock.
  • Using a separate lockfile removes the need to jump through hoops to protect clients that successfully grab a lock, only to find they hold it on a now-deleted file rather than on the version currently bound to the directory.
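A minimal sketch of that arrangement, assuming Python 3 and `flock`; the helper names `locked` and `replace_file`, and the choice of `mkstemp` for the temporary file, are illustrative assumptions rather than the production code:

```python
import contextlib
import fcntl
import os
import tempfile


@contextlib.contextmanager
def locked(path):
    """Hold an exclusive flock on `<path>.lck` for the duration of the block."""
    lock_fd = os.open(path + ".lck", os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(lock_fd, fcntl.LOCK_EX)
        yield
    finally:
        os.close(lock_fd)  # closing the only descriptor releases the lock


def replace_file(path, new_bytes):
    """Atomically replace `path` with `new_bytes`; takes no part in locking."""
    dirname = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_bytes)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.rename(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory so the rename itself survives a crash.
    dir_fd = os.open(dirname, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Callers would wrap either an in-place append or a call to `replace_file(path, data)` in `with locked(path):`. Because the lockfile is never renamed, a lock holder is never left holding a lock on a stale inode.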
Charles Duffy