
I want to remove a file with os.remove(), and then do some work on the remaining files in the directory. However, I find that os.listdir() still lists a removed file when that file is larger than a certain size. "OK", I thought, "os.remove() just works asynchronously. No big deal, I'll just use os.path.isfile() to check whether the file has been completely removed yet". This turned out not to work. The following code exemplifies the problem:

import os

with open("test/test.txt", 'w') as file:
    for _ in range(100):
        file.write("spam")

print os.path.isfile("test/test.txt")
print os.listdir("test/")

os.remove("test/test.txt")

print os.path.isfile("test/test.txt")
print os.listdir("test/")

This creates a small file of 400 bytes. The output is as expected:

True
['test.txt']
False
[]

But when the number of "spam"s written is increased to 10,000,000 (a 40 MB file), the following output occurs:

True
['test.txt']
False
['test.txt']

So, isfile() is quite aware that the file has been erased, but listdir() hasn't caught on yet.

Is there a more robust way of checking if a file exists, that will always agree with a following listdir() call?

Tested with Python 2.7 on Windows 7, should it matter.

----EDIT

I have no intention of opening any files right away; I want to display all files remaining in the directory in a listbox. I feel that opening every file to check whether it's there is uncalled for, but perhaps that is the Pythonic way of doing things?

  • Python's general motto is [EAFP](http://stackoverflow.com/questions/11360858/what-is-the-eafp-principle-in-python); just blindly do whatever you want to do and catch errors. Doubly true when dealing with files. Also see the [`errno`](https://docs.python.org/3.4/library/errno.html) package if you want to slice up the `IOError`s you might catch – Nick T Nov 08 '14 at 01:08
  • I noticed that if I run it interactively, then it works the same for both files, as expected. So, I think it has something to do with the timing. Perhaps when you run it in a script, the remove operation has not yet completed before isfile and listdir run, and each reacts to the state differently. – David Nov 08 '14 at 01:13
  • I don't know how much this helps, but this code behaves like you'd expect on my Mac. So it could well be something about Windows file interactions specifically. – Erin Call Nov 08 '14 at 01:17
  • I tried this on Windows 7 / Python 2.7.2.5 and did not see your problem. The second `os.listdir` did not list any files. I would consider this a critical error. I tried it on a local drive and on a CIFS share to a Linux Samba server. What media are you using? Can anybody reproduce this?! – tdelaney Nov 08 '14 at 02:36

2 Answers


On Windows, if you delete a file that something else has open (with FILE_SHARE_DELETE), it's not actually deleted until it's closed. Instead, its directory entry remains, marked as "delete-pending" (and attempts to open it fail with an obscure error). I think this is the cause of the discrepancy: os.listdir still reports the entry, while os.path.isfile sees that it exists but is not a "regular file" anymore.

If this is the case, os.path.exists should return True.
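One way to test this hypothesis is to compare what the three checks report for the same entry right after the removal. The helper below is a minimal sketch; the name `entry_status` is made up for illustration and is not part of the `os` module.

```python
import os


def entry_status(directory, name):
    """Report how three different checks see the entry directory/name.

    Hypothetical diagnostic helper, not a library function.
    """
    full_path = os.path.join(directory, name)
    return {
        "listed": name in os.listdir(directory),  # what listdir() reports
        "exists": os.path.exists(full_path),      # any kind of entry
        "isfile": os.path.isfile(full_path),      # regular file only
    }
```

Calling `entry_status("test/", "test.txt")` immediately after `os.remove("test/test.txt")` should show whether the checks disagree on your system.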

One possible solution in this case is to keep only the entries that os.path.isfile confirms (with the additional benefit of filtering out directories and such), i.e.:

path = "test/"  # the directory from the question
print([f for f in os.listdir(path) if os.path.isfile(os.path.join(path, f))])

Of course, this leaves room for a race condition if something changes between the listdir call and the isfile checks. In this case, it's unavoidable unless you have an OS that supports transactions for system calls.
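If the stale entry really is only temporary, another option is to poll the directory until the name disappears. This is a sketch rather than a guaranteed fix: the function name `remove_and_wait` and the `timeout` and `interval` defaults are assumptions chosen for illustration, and it assumes the entry goes away on its own once whatever holds the file open releases it.

```python
import os
import time


def remove_and_wait(directory, name, timeout=5.0, interval=0.05):
    """Remove directory/name, then poll until os.listdir() stops reporting it.

    Hypothetical helper. Returns True if the entry disappeared within
    `timeout` seconds, False if it was still listed when time ran out.
    """
    os.remove(os.path.join(directory, name))
    deadline = time.time() + timeout
    while name in os.listdir(directory):
        if time.time() >= deadline:
            return False  # still listed; the caller decides what to do
        time.sleep(interval)
    return True
```

A `False` return means the entry was still listed after the timeout, which would suggest that something is holding the file open rather than the deletion merely being slow.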

ivan_pozdeev
  • Using [Process Monitor (ProcMon)](https://technet.microsoft.com/en-us/library/bb896645.aspx), I have just confirmed a situation where `os.path.exists()` and `os.path.isdir()` both return `False` while ProcMon reports `DELETE PENDING` on the directory of interest. – DavidRR Mar 17 '15 at 17:50

You didn't mention what you wanted to do with the files, but in Python it's Easier to Ask Forgiveness than Permission (EAFP), so just try the operation and fail gracefully if something goes wrong.

import os
import glob
import errno

def process(file_handle):
    for line in file_handle:
        return line

for fn in glob.iglob('*.py'):
    try:
        with open(fn) as f:
            print(process(f))
    except IOError as e:
        if e.errno == errno.ENOENT:
            # file doesn't exist, ignore
            pass
        else:
            # some other error we don't want to suppress
            raise
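
The same pattern can be applied to the removal step itself. The sketch below is not from the question; `remove_if_present` is a hypothetical name. It treats "file already gone" as a normal outcome and re-raises every other error.

```python
import errno
import os


def remove_if_present(path):
    """Remove path; return True if it was removed, False if it did not exist.

    Hypothetical helper illustrating EAFP with os.remove().
    """
    try:
        os.remove(path)
    except OSError as e:
        if e.errno != errno.ENOENT:
            # some other error (permissions, file in use, ...): don't hide it
            raise
        return False
    return True
```
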
Nick T