os.rename, os.replace and shutil.move errors on Windows 10

Question

I'm trying to implement simple file locking by renaming a file on Windows 10. The test program below renames a file to lock it, opens and reads it, then renames it back to unlock it. However, I see intermittent errors when I run two instances of it simultaneously with different arguments (e.g. test.py 1 and test.py 2).

import sys
import os
from time import sleep
import shutil

def lockFile():
    while True:
        try:
            os.replace("testfile", "lockfile"+sys.argv[1])
            if(os.path.exists("lockfile"+sys.argv[1])):
                print("successfully locked", flush=True)
                print(os.stat("lockfile"+sys.argv[1]))
            else:
                print("failed to lock", flush=True)
                raise BaseException()
            return
        except:
            print("sleeping...", flush=True)
            sleep(1)

def unlockFile():
    while True:
        try:
            os.replace("lockfile"+sys.argv[1], "testfile")
            if(os.path.exists("testfile")):
                print("successfully unlocked", flush=True)
            else:
                print("failed to unlock", flush=True)
                raise BaseException()
            return
        except:
            print("sleeping...", flush=True)
            sleep(1)

while True:
    lockFile()
    if(os.path.exists("lockfile"+sys.argv[1])):
        print("file is available", flush=True)
    else:
        print("file is not available", flush=True)
    with open(("lockfile"+sys.argv[1])) as testFile:
        contents = testFile.read()
        print(contents.rstrip(), flush=True)
    unlockFile()

What I'm seeing is that occasionally the rename/replace/move doesn't raise an exception, os.path.exists says the locked file is present, and I can stat the locked file, but then the locked file is suddenly gone and I can't open it:

successfully locked
os.stat_result(st_mode=33206, st_ino=9288674231797231, st_dev=38182903, st_nlink=1, st_uid=0, st_gid=0, st_size=12, st_atime=1536956584, st_mtime=1536956584, st_ctime=1536942815)
file is not available
Traceback (most recent call last):
  File "test.py", line 41, in <module>
    with open(("lockfile"+sys.argv[1])) as testFile:
FileNotFoundError: [Errno 2] No such file or directory: 'lockfile2'

I'm assuming that test.py 1 & 2 both contain the above code, so my first thought is that there is an issue with synchronization. Have you tried starting one before the other? I'm thinking that they're both running simultaneously, and since they are both accessing the same file, it may be possible for both to get past the test condition before the other has a chance to change the file. — K-Log, Sep 14 '18 at 21:18
I start one and then the other. They are renaming to different files. So theoretically even if both of them did the rename without exception, only one of them should be able to do the os.path.exists successfully. — Patrick Einheber, Sep 14 '18 at 21:25
What if you moved the `with open ...` block inside the "file is available" branch, to make sure you don't try to open a file that doesn't exist? — K-Log, Sep 14 '18 at 21:29
By that point, the previous os.path.exists in lockFile() verified the file exists and I already did a stat on it. So a second os.path.exists shouldn't then fail. — Patrick Einheber, Sep 14 '18 at 21:33

score 1 · Answer 1 · answered Sep 14 '18 at 21:44

I think part of the problem is that os.path.exists lies:

Directories cache the mapping from file names to file handles. The most common problems with this are:

• You have an opened file, and you need to check if the file has been replaced by a newer file. You have to flush the parent directory's file handle cache before stat() returns the new file's information and not the opened file's.

◦ Actually this case has another problem: The old file may have been deleted and replaced by a new file, but both of the files may have the same inode. You can check this case by flushing the open file's attribute cache and then seeing if fstat() fails with ESTALE.

• You need to check if a file exists, for example a lock file. The kernel may have cached that the file does not exist, even if in reality it does. You have to flush the parent directory's negative file handle cache to see if the file really exists.

So sometimes when your function is checking to see if the path exists in the lockFile() function, it doesn't actually exist.
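If a separate existence check cannot be trusted, one option is to drop it and let the rename and the open themselves report success or failure. The following is only a minimal sketch of that idea and has not been run against the Windows race from the question. The helper name read_locked and its parameters are hypothetical, and it uses os.rename rather than os.replace on the assumption that refusing to overwrite an existing destination on Windows is the safer behaviour for a lock.

```python
import os


def read_locked(shared_name, lock_name):
    """Lock shared_name by renaming it, read it, then rename it back.

    Hypothetical helper: returns the file contents, or None if the
    lock could not be taken.
    """
    try:
        # On Windows, os.rename raises FileExistsError if lock_name exists.
        os.rename(shared_name, lock_name)
    except OSError:
        # Someone else holds the file (or it is missing); caller may retry.
        return None
    try:
        # Opening the file is the real test; no os.path.exists check.
        with open(lock_name) as locked_file:
            return locked_file.read()
    finally:
        # Always attempt to release; an error here propagates to the caller.
        os.rename(lock_name, shared_name)
```

A caller would loop on read_locked("testfile", "lockfile" + sys.argv[1]) until it returns something other than None. If the open still fails after a successful rename, the exception surfaces instead of being swallowed.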

Not sure how to flush the file handle cache. The Python os module documentation is littered with the phrase: "Call os.stat() to fetch up-to-date information." But even adding that doesn't seem to do the trick. — Patrick Einheber, Sep 17 '18 at 20:07

score 0 · Accepted Answer · answered Sep 17 '18 at 21:11

OK, based on the "os.path lies" post linked above, I cobbled together a solution. This may still just be lucky timing, and it is Windows-only at this point. If I swap the subprocess.Popen call for rename/replace, or omit the os.stat before the os.path.exists check, it doesn't work. But this code doesn't seem to hit the problem. Tested with 5 scripts running simultaneously and without sleep calls.

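import os
import subprocess  # needed in addition to the imports in the question's script
import sys

def lockFile():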
    while True:
        try:
            p = subprocess.Popen("rename testfile lockfile"+sys.argv[1], shell=True,
                                 stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
            result = p.wait()
            statresult = os.stat("lockfile"+sys.argv[1])
            if(os.path.exists("lockfile"+sys.argv[1])):
                print("successfully locked", flush=True)
                print(os.stat("lockfile"+sys.argv[1]), flush=True)
            else:
                print("failed to lock", flush=True)
                raise BaseException()
            return
        except BaseException as err:
            print("sleeping...", flush=True)
            #sleep(1)

def unlockFile():
    while True:
        try:
            p = subprocess.Popen("rename lockfile"+sys.argv[1] + " testfile", shell=True,
                                 stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
            result = p.wait()
            statresult = os.stat("testfile")
            if(os.path.exists("testfile")):
                pass
            else:
                print("failed to unlock", flush=True)
                raise BaseException()
            return
        except BaseException as err:
            print("sleeping...", flush=True)
            #sleep(1)
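With the sleep calls commented out, the loops above retry forever without pausing, and catching BaseException also swallows Ctrl+C. As a rough sketch only, a bounded retry that catches just OSError might look like the following. The name retry_os_call and its default values are hypothetical placeholders, not part of the tested solution.

```python
import time


def retry_os_call(action, attempts=5, delay=0.1):
    """Call action() until it stops raising OSError or attempts run out.

    Hypothetical helper; attempts and delay are arbitrary defaults.
    """
    for attempt in range(attempts):
        try:
            return action()
        except OSError:
            # Only OSError is retried, so KeyboardInterrupt still
            # stops the script.
            if attempt == attempts - 1:
                raise  # give up and let the caller see the last error
            time.sleep(delay)
```

For example, retry_os_call(lambda: os.stat("testfile")) would retry the stat a few times before giving up with the original exception.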
