19

Is it possible to check if a file has been deleted or recreated in python?

For example, if the script calls open("file") and, while that file is still open, you run rm file; touch file;, the script still holds a reference to the old file even though it has already been deleted.

Mattie
user1502906

3 Answers

24

You should fstat the file descriptor for the opened file.

>>> import os
>>> f = open("testdv.py")
>>> os.fstat(f.fileno())
posix.stat_result(st_mode=33188, st_ino=1508053, st_dev=65027L, st_nlink=1, st_uid=1000, st_gid=1000, st_size=1107, st_atime=1349180541, st_mtime=1349180540, st_ctime=1349180540)
>>> os.fstat(f.fileno()).st_nlink
1

Ok, this file has one link, so one name in the filesystem. Now remove it:

>>> os.unlink("testdv.py")
>>> os.fstat(f.fileno()).st_nlink
0

No more links, so we have an "anonymous file" that's only kept alive as long as we have it open. Creating a new file with the same name has no effect on the old file:

>>> g = open("testdv.py", "w")
>>> os.fstat(g.fileno()).st_nlink
1
>>> os.fstat(f.fileno()).st_nlink
0

Of course, st_nlink can sometimes be >1 initially, so checking that for zero is not entirely reliable (though in a controlled setting, it might be good enough). Instead, you can verify whether the file at the path you initially opened is the same one that you have a file descriptor for by comparing stat results:

>>> os.stat("testdv.py") == os.fstat(f.fileno())
False
>>> os.stat("testdv.py") == os.fstat(g.fileno())
True

(And if you want this to be 100% correct, then you should compare only the st_dev and st_ino fields on stat results, since the other fields and st_atime in particular might change in between the calls.)
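As a minimal sketch of that last point, the comparison can be wrapped in a small helper. The name `path_still_points_to` is made up for this illustration; it assumes a POSIX-like system where the device and inode numbers together identify a file, and it treats a missing path as "not the same file".

```python
import os


def path_still_points_to(f, path):
    """Return True if `path` currently names the same file that `f` has open.

    Hypothetical helper for illustration; not part of the standard library.
    """
    try:
        on_disk = os.stat(path)
    except FileNotFoundError:
        # The name is gone: the file was deleted or renamed and not recreated.
        return False
    opened = os.fstat(f.fileno())
    # Compare only device and inode; timestamps and sizes may legitimately differ.
    return (on_disk.st_dev, on_disk.st_ino) == (opened.st_dev, opened.st_ino)
```

The standard library's os.path.samestat() performs the same device/inode comparison on two stat results, so it could replace the last line if you prefer.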

Fred Foo
5

Yes. Use the os.stat() function on the path to check the file length. If the length is zero (or the call fails with a "file not found" error), then someone deleted the file.

Alternatively, you can open+write+close the file each time you need to write something into it. The drawback is that opening a file is a pretty slow operation, so this is out of the question if you need to write a lot of data.
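A rough sketch of that open+write+close approach, assuming text data and append mode; `append_line` is a name invented for this example:

```python
def append_line(path, line):
    """Append one line to whatever file currently has the name `path`.

    Hypothetical helper: the path is resolved again on every call, so after
    an rm + touch the next write lands in the new file, not the deleted one.
    """
    with open(path, "a") as f:
        f.write(line + "\n")
```

Append mode also creates the file if the name no longer exists, so a plain rm without a touch is handled as well.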

Why? Because the new file isn't the file that you're holding open. In a nutshell, Unix filesystems have two levels. One is the directory entry, which maps the file name to the file; the second level is the file itself, i.e. its inode (file size, modification time, pointer to the data) and the file data.

When you open a file, Unix uses the name to find the file data. After that, it operates only on the second level - changes to the directory entry have no effect on any open "file handles". Which is exactly why you can delete the directory entry: Your program isn't using it.

When you use os.stat(), you don't look at the file data but at the directory entry again.

On the positive side, this allows you to create files which no one can see but your program: Open the file, delete it and then use it. Since there is no directory entry for the file, no other program can access the data.

On the negative side, you can't easily solve problems like the one you have.
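To illustrate the "open it, delete it, then use it" trick mentioned above, here is a small sketch for POSIX systems. `open_private_scratch` is a hypothetical name, and the use of tempfile.mkstemp() to pick the initial file name is an assumption of this example, not something the answer prescribes.

```python
import os
import tempfile


def open_private_scratch(directory=None):
    """Create a file, remove its name, and return the still-open file object.

    Hypothetical helper for illustration. Assumes POSIX semantics, where an
    open file can be unlinked and remains usable until it is closed.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    f = os.fdopen(fd, "w+b")
    os.unlink(path)  # drop the directory entry; the data lives on via `f`
    return f


scratch = open_private_scratch()
scratch.write(b"scratch data")
scratch.seek(0)
data = scratch.read()  # read back through the handle we still hold
scratch.close()        # the storage is released once the file is closed
```

On POSIX systems, tempfile.TemporaryFile() gives you much the same behaviour out of the box, so in practice that is usually the simpler choice.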

Aaron Digulla
  • 2
    On Linux you could look into `/proc/<pid>/fd/...` and access the data even if the file was deleted. That sometimes comes in handy if you want to make a copy of a video you are downloading from youtube ;-) – hochl Oct 02 '12 at 12:42
  • @hochl: Interesting. Note: To read the contents of the `fd` directory of a process, you need to be that user or root (permissions are `dr-x------`), so it's still secure. – Aaron Digulla Oct 02 '12 at 12:45
  • @AaronDigulla I just did a quick test, and I was able to do what I wanted with fstat (since stat still takes a filename) and looking at st_nlink (the number of hardlinks). I don't think the file length changes when it gets deleted. – user1502906 Oct 02 '12 at 12:52
  • Checking for the file length is very unreliable -- what if someone creates a file with the same name **and length**? See my answer for a more reliable approach. – Fred Foo Oct 02 '12 at 13:11
  • I agree, your answer catches more corner cases. – Aaron Digulla Oct 02 '12 at 13:34
  • @user1502906: You should not use the term "file" in your case; it's very confusing. Be specific. The file data of the open file doesn't change but `stat()` should check the directory entry (i.e. the new file) and when you touch the new file, the new file data should have the length 0. – Aaron Digulla Oct 02 '12 at 13:36
3

Yes -- you can use the inotify facility to check for file changes and more. There is also a Python binding for it. Using inotify you can watch files or directories for filesystem activity. From the manual, the following events can be detected:

IN_ACCESS         File was accessed (read) (*).
IN_ATTRIB         Metadata changed, e.g., permissions, timestamps, extended attributes, link count (since Linux 2.6.25), UID, GID, etc. (*).
IN_CLOSE_WRITE    File opened for writing was closed (*).
IN_CLOSE_NOWRITE  File not opened for writing was closed (*).
IN_CREATE         File/directory created in watched directory (*).
IN_DELETE         File/directory deleted from watched directory (*).
IN_DELETE_SELF    Watched file/directory was itself deleted.
IN_MODIFY         File was modified (*).
IN_MOVE_SELF      Watched file/directory was itself moved.
IN_MOVED_FROM     File moved out of watched directory (*).
IN_MOVED_TO       File moved into watched directory (*).
IN_OPEN           File was opened (*).

From here you can google yourself a solution, but I think you get the overall idea. Of course, inotify is a Linux facility, so this only works on Linux, but from your question I assume that is what you are using (references to rm and touch).

hochl