how to open (create if not exists) a file while acquiring exclusive lock avoiding races

Question

In Python 2.7, is it possible (and if so, how) to do the following in a single atomic (race-free) operation:

Open a file
- If it doesn't exist, create it, then open it.
Acquire an exclusive lock on the file (no other process can open or delete the file)

Context: I have a single Python program that fetches files given a list of URL/md5 pairs. If a file on the list exists and its md5 matches, it is skipped; if not, it is downloaded. There may be multiple instances of this program processing different lists, which may overlap.

This question is almost what I need to do, but in my case I need to lock the file either way to check its md5, while preventing others from doing so as well. Also, I don't need to know whether the file existed prior to the operation: if it was just created, the file will be empty and its md5 won't match, so it will be downloaded anyway.

I'm using this program on Linux specifically, but cross-platform solutions are welcome.

EDIT: In the end I solved my issue by:

Opening the file in a+b mode (not atomic; creates the file if it doesn't exist).
Trying to lock the file exclusively (advisory):
- If that succeeds, work on the file.
- If it fails, assume someone else is working on the file and skip to the next one. Once there are no more files to process, come back and check whether whoever locked the file did the job right.

As it stands, the desired operation is not supported in a single atomic step, but it is not needed either.
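
The steps in the EDIT above could look roughly like the following. This is a minimal sketch, not the asker's actual code: it assumes Linux, it assumes the advisory lock is taken with `fcntl.flock()` (the post does not say which call was used), and the helper name `try_lock_and_md5` is made up for illustration.

```python
import errno
import fcntl
import hashlib


def try_lock_and_md5(path):
    """Open path (creating it if missing), try to lock it, and checksum it.

    Returns (file_object, md5_hex) while holding the lock, or None if
    another open file description already holds the lock.
    Hypothetical helper; flock() is an assumption, not from the post.
    """
    # "a+b" creates the file if needed and never truncates existing data.
    f = open(path, "a+b")
    try:
        # LOCK_NB makes flock() fail immediately instead of blocking.
        fcntl.flock(f.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except (IOError, OSError) as e:
        f.close()
        if e.errno in (errno.EAGAIN, errno.EACCES):
            return None  # someone else is working on this file
        raise
    # We hold the lock: checksum whatever is currently in the file.
    f.seek(0)
    digest = hashlib.md5(f.read()).hexdigest()
    return f, digest
```

The lock is released when the returned file object is closed. A freshly created file is empty, so its md5 will not match the expected one and the caller would go on to download it, as described in the question. Reading the whole file into memory is only for brevity; a real program would probably hash it in chunks.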

The answer to the question you refer to states that it is impossible to atomically open and lock a file in Linux. Indeed, the only way of (non-advisory) locking a file in Linux is `fcntl()`, but it cannot be combined with the `open()` system call. But is this atomicity actually crucial? You can easily check the file's content after acquiring the lock on it. – Tsyvarev, Jun 20 '15 at 15:11
@Tsyvarev ...actually, you are right. I was biased in coupling the two operations because the current non-safe implementation I have couples them. But as you and pilcrow have said, the *file creation* need not be atomic, only *locking the file* needs to be. – gcscaglia, Jun 20 '15 at 16:50
@dhke yes I can, but the main reason for checking the md5 is to avoid downloading the same file twice while ensuring the download is complete and not corrupted in any way. If I change the filename, two instances would download the same file twice. – gcscaglia, Jun 20 '15 at 16:53
@gcscaglia Well, the `rename()` operation should be atomic (see also the answer below). Hence you should be able to download to a temporary file, verify the checksum of the temporary file, and move the file over to the correct location if and only if the download succeeded. Bonus objective: encode the md5 in the filename for easy lookup. This requires, however, that nobody improperly fiddles with your directory outside of your app. – dhke, Jun 20 '15 at 17:03
Note that another process can delete the file, even if it is exclusively locked. And other processes can open the file, even if it is exclusively locked. They may not be able to do anything more (it depends on mandatory vs advisory locking), but they'll be able to do that much. – Jonathan Leffler, Jun 25 '15 at 14:14
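
The temporary-file-plus-`rename()` idea from dhke's comment can be sketched as follows. This is only an illustration under assumptions: the function name `store_verified` and the `.part` suffix are invented here, and the downloaded data is modelled as an iterable of byte chunks rather than a real network fetch.

```python
import hashlib
import os
import tempfile


def store_verified(chunks, expected_md5, final_path):
    """Write chunks to a temp file; rename it into place only if the md5 matches.

    Returns True if final_path was (re)placed, False if the checksum failed.
    Hypothetical helper, not taken from the thread.
    """
    # Creating the temp file in the destination directory is the simplest
    # way to keep both paths on the same filesystem, which rename() needs
    # in order to be atomic.
    directory = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    md5 = hashlib.md5()
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                tmp.write(chunk)
                md5.update(chunk)
        if md5.hexdigest() != expected_md5:
            os.unlink(tmp_path)  # corrupt or incomplete: discard it
            return False
        # On POSIX, readers see either the old file or the complete new one.
        os.rename(tmp_path, final_path)
        return True
    except Exception:
        # Best-effort cleanup of the partial file, then re-raise.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

As gcscaglia notes above, this alone does not stop two instances from downloading the same file at the same time; it only ensures that a partial or corrupt download never appears under the final name.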

score 1 · Answer 1 · edited Jun 25 '15 at 14:20

It is not possible, at least according to this comprehensive report:

mv -T <oldsymlink> <newsymlink> atomically changes the target of <newsymlink> to the directory pointed to by <oldsymlink> and is indispensable when deploying new code. Updated 2010-01-06: both operands are symlinks. (So this isn’t a system call, it’s still useful.) A reader pointed out that ln -Tfs <directory> <symlink> accomplishes the same thing without the second symlink. Added 2010-01-06. Deleted 2010-01-06: strace(1) shows that ln -Tfs <directory> <symlink> actually calls symlink(2), unlink(2), and symlink(2) once more, disqualifying it from this page. mv -T <oldsymlink> <newsymlink> ends up calling rename(2) which can atomically replace <newsymlink>. Caveat 2013-01-07: this does not apply to Mac OS X, whose mv(1) doesn’t call rename(2). mv(1).

link(oldpath, newpath) creates a new hard link called newpath pointing to the same inode as oldpath and increases the link count by one. This will fail with the error code EEXIST if newpath already exists, making this a useful mechanism for locking a file amongst threads or processes that can all agree upon the name newpath. I prefer this technique for whole-file locking because the lock is visible to ls(1). link(2).

symlink(oldpath, newpath) operates very much like link(2) but creates a symbolic link at a new inode rather than a hard link to the same inode. Symbolic links can point to directories, which hard links cannot, making them a perfect analogy to link(2) when locking entire directories. This will fail with the error code EEXIST if newpath already exists, making this a perfect analogy to link(2) that works for directories, too. Be careful of symbolic links whose target inode has been removed ("dangling" symbolic links) — open(2) will fail with the error code ENOENT. It should be mentioned that inodes are a finite resource (this particular machine has 1,245,184 inodes). symlink(2). Added 2010-01-07

rename(oldpath, newpath) can change a pathname atomically, provided oldpath and newpath are on the same filesystem. This will fail with the error code ENOENT if oldpath does not exist, enabling interprocess locking much like link(oldpath, newpath) above. I find this technique more natural when the files in question will be unlinked later. rename(2).

open(pathname, O_CREAT | O_EXCL, 0644) creates and opens a new file. (Don’t forget to set the mode in the third argument!) O_EXCL instructs this to fail with the error code EEXIST if pathname exists. This is a useful way to decide which process should handle a task: whoever successfully creates the file. open(2).

mkdir(dirname, 0755) creates a new directory but fails with the error code EEXIST if dirname exists. This provides for directories the same mechanism open(2) with O_EXCL provides for files. mkdir(2). Added 2010-01-06; edited 2013-01-07.

As you can see, open() can be used atomically only to create new files, not to open existing ones for reading. If you want to use this approach anyway, you can use Python's os.open(), which is a thin wrapper around this syscall (not to be confused with the built-in open()).
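
For illustration, here is a small sketch of os.open() with O_CREAT | O_EXCL. The function name `create_exclusive` is hypothetical; the flags and the EEXIST behaviour are the ones described in the quoted report.

```python
import errno
import os


def create_exclusive(path):
    """Return a file descriptor if we created path, or None if it already existed."""
    try:
        # O_CREAT | O_EXCL: create the file, failing with EEXIST if it exists.
        # The third argument is the permission mode for the new file.
        return os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except OSError as e:
        if e.errno == errno.EEXIST:
            return None  # somebody else created it first
        raise
```

Whichever process gets a descriptor back is the one that created the file; every other caller gets None. The caller is responsible for closing the descriptor with os.close().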

You might also consider using databases for this task, since they should offer much more reliability (for example, what if your files are hosted on NFS, which implements no locking at all and IIRC the only atomic operation there is mkdir()?).

Indeed, any file system with no locking would mean my program won't be race-safe; in such a situation creating a lockfile exclusively as pilcrow suggested seems the best option. Then again, databases seem a bit overkill: it's a simple program for personal use, it's just that my bandwidth is really limited so I want to avoid double usage for the same file. The file will need to be directly available on the disk later on anyway. Ty – gcscaglia, Jun 20 '15 at 17:19

score 1 · Accepted Answer · edited May 23 '17 at 11:43

No, it is not possible as a basic operation supported by Linux/UNIX.

The O_CREAT|O_EXCL technique in the answer you referenced can work here. Instead of exclusively creating the target file, you exclusively create a lockfile whose name is predictably derived from the target file. E.g., os.path.join("/tmp", hashlib.md5(target_filename).hexdigest() + ".lock").

However, as others have suggested, it's not clear that you need to protect both the target file creation and its checksumming + possible replacement. An fcntl advisory lock will suit your needs.
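
The lockfile variant described above might be sketched like this. It is an assumption-laden illustration rather than a definitive implementation: the names `lockfile_path` and `exclusive_lockfile` are invented, the lock directory defaults to the system temp directory instead of a hard-coded "/tmp", and the `.encode("utf-8")` call is only there so the hashing also works on Python 3.

```python
import contextlib
import errno
import hashlib
import os
import tempfile


def lockfile_path(target_filename, lock_dir=None):
    """Derive a predictable lockfile name from the target file's name."""
    lock_dir = lock_dir or tempfile.gettempdir()
    digest = hashlib.md5(target_filename.encode("utf-8")).hexdigest()
    return os.path.join(lock_dir, digest + ".lock")


@contextlib.contextmanager
def exclusive_lockfile(target_filename, lock_dir=None):
    """Yield True if we created the lockfile (we own the target), else False."""
    path = lockfile_path(target_filename, lock_dir)
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise
        fd = None  # another process already holds the lockfile
    if fd is None:
        yield False
        return
    try:
        yield True
    finally:
        os.close(fd)
        os.unlink(path)  # release the lock by removing the lockfile
```

A caller would wrap the checksum-and-download step in `with exclusive_lockfile(target) as mine:` and skip the file when `mine` is False. One known weakness of this design is that a process that crashes while holding the lock leaves a stale lockfile behind, which is something an fcntl-style advisory lock avoids.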

answered Jun 20 '15 at 15:33

pilcrow

I was biased in coupling the two operations, but as you said it's not really needed. From your answer I think creating the file non-exclusively and locking it before the checksumming will do just fine. I thought it was strange for such a create-open-lock operation not to exist, but thinking again I can't see a situation where it would be indispensable after all. Ty. – gcscaglia Jun 20 '15 at 17:01
I'm accepting your answer. Although both you and rr- are right that the operation is not possible, your answer also correctly identified that it's not necessary at all. – gcscaglia Jun 25 '15 at 13:45
