0

Hello, I want to get the file ID of files on Windows with Python. When I searched, I could only find how to do it in other languages. Does anybody know how I can achieve this in Python?

shiteatlife
  • So are you expecting the IDs of the files or the contents inside the files? – Vikas Periyadath Jan 24 '18 at 07:04
  • Only the IDs, and I would like to be able to get the files later using the IDs. – shiteatlife Jan 24 '18 at 07:12
  • The files in a particular directory? – Vikas Periyadath Jan 24 '18 at 07:18
  • I want to get all the file IDs from my current directory. – shiteatlife Jan 24 '18 at 07:21
  • My idea was to get files by their IDs, so that if they change directory I could still find them by ID. – shiteatlife Jan 24 '18 at 07:31
  • In Python 3 `os.stat` calls `GetFileInformationByHandle` and returns the file number as `st_ino`. But calling `OpenFileById` requires ctypes or PyWin32. – Eryk Sun Jan 24 '18 at 10:01
  • @eryksun I looked at `st_ino` and got a number, but in some cases it also changes when the file is edited. I tested this with a program and also read about the `inode` number. It doesn't actually represent a unique file ID. I think there is no such file ID available. – Vikas Periyadath Jan 24 '18 at 10:08
  • @VikasDamodar, it's not an inode number. Windows doesn't have inodes. It's just using the `st_ino` field as the closest analog. The file number should be stable for NTFS, and the same for all hard links, unless of course the file is completely deleted and replaced with a new file that has the same name. In FAT32, the file number is not stable since it's based on the first cluster number of the parent directory and byte offset of the file entry in the directory. For example, disk defragmenting programs can change FAT32 file numbers (the built-in defrag in Windows does not). – Eryk Sun Jan 24 '18 at 10:24
  • @VikasDamodar, also, Python 3 uses the 32-bit volume serial number for `st_dev`, since that's the closest, dependable analog to a Unix device number. However, like the file number, the VSN isn't necessarily unique or non-zero. It's fine for NTFS, except for the rare case that two volumes have the same VSN. – Eryk Sun Jan 24 '18 at 10:33
  • I had actually tried stat like this: `st = os.stat(file)`, but this returns 0L, and when I try the same code on other files it also gives 0L. I checked and my drive is NTFS. Full result below: `(nt.stat_result(st_mode=33206, st_ino=0L, st_dev=0L, st_nlink=0, st_uid=0, st_gid=0, st_size=855164972L, st_atime=1516774678L, st_mtime=1516775601L, st_ctime=1516774678L))` – shiteatlife Jan 24 '18 at 12:38
  • @eryksun After searching a bit I found that Python 2 only shows dummy data here, so I tried it with Python 3 and now I do get the `st_ino`! Do you know if I can get/open a file with the `st_ino`, or get the location of the file? – shiteatlife Jan 24 '18 at 14:37
  • @Vural, a file or directory can be opened by file number via [`OpenFileById`](https://msdn.microsoft.com/en-us/library/aa365432). This can be called via PyWin32's `win32file.OpenFileById` function or via ctypes. Since the file number isn't unique system-wide, this function also requires a handle for a file or directory on the volume, or for the volume device itself (e.g. `r"\\.\C:"`). You can get the underlying Windows file handle for a C file descriptor via `msvcrt.get_osfhandle(fd)`. If you use the built-in `open` to get a file object, the fd is `f.fileno()`. `os.open` returns an fd directly. – Eryk Sun Jan 24 '18 at 18:58
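The comments above suggest identifying a file by the pair (`st_dev`, `st_ino`) that `os.stat` returns in Python 3. A minimal sketch of collecting those pairs for every file in a directory could look like the following. The helper name `file_ids` is hypothetical and not from the thread. It calls `os.stat` on each path rather than `DirEntry.stat()`, because the latter leaves `st_ino` and `st_dev` at zero on Windows.

```python
import os


def file_ids(directory="."):
    """Map each file name in `directory` to its (st_dev, st_ino) pair.

    `file_ids` is a hypothetical helper for illustration only.
    """
    ids = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip subdirectories and symlinks; only regular files are listed.
            if entry.is_file(follow_symlinks=False):
                # A full os.stat call is needed on Windows, where
                # DirEntry.stat() reports st_ino and st_dev as zero.
                st = os.stat(entry.path, follow_symlinks=False)
                ids[entry.name] = (st.st_dev, st.st_ino)
    return ids
```

Calling `file_ids()` with no argument scans the current directory. Hard links to the same file should report the same pair, which matches the behaviour described for NTFS in the comments.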

2 Answers

2

As far as I could find, there is no such file ID available. Instead, you can use the creation date on Windows and Mac, and the last-modified date on Linux. These are usually sufficient to identify a file, even if it has been renamed or altered.

Here's how to do it; the SO thread where I found the solution is linked in the docstring.

import os
import platform

def creation_date(path_to_file):
    """
    Try to get the date that a file was created, falling back to when it was
    last modified if that isn't possible.
    See http://stackoverflow.com/a/39501288/1709587 for explanation.
    """
    if platform.system() == 'Windows':
        return os.path.getctime(path_to_file)
    else:
        stat = os.stat(path_to_file)
        try:
            return stat.st_birthtime
        except AttributeError:
            # We're probably on Linux. No easy way to get creation dates here,
            # so we'll settle for when its content was last modified.
            return stat.st_mtime
John
IMCoins
0
import os

path_to_file = r"path_to_your_file"
file_id = os.stat(path_to_file, follow_symlinks=False).st_ino
print(hex(file_id))

To check the result from the command line:

c:\> fsutil file queryfileid path_to_your_file

So in Python you can also use:

print(os.popen(f'fsutil file queryfileid "{path_to_file}"').read())

Or, when the file has hard links:

print(os.popen(f'fsutil hardlink list "{path_to_file}"').read())

To find the filename for a given ID:

print(os.popen(fr'fsutil file queryFileNameById c:\ the_file_id').read())
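If shelling out to `fsutil` is not an option, one possible fallback is to walk a directory tree and compare each file's (`st_dev`, `st_ino`) pair against a previously recorded one. This is only a sketch under that assumption: the function name `find_by_id` is hypothetical, and a full scan is much slower than a lookup by ID on the volume.

```python
import os


def find_by_id(root, dev, ino):
    """Return every path under `root` whose (st_dev, st_ino) equals (dev, ino).

    `find_by_id` is a hypothetical helper for illustration only.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                # The file disappeared or cannot be accessed; skip it.
                continue
            if (st.st_dev, st.st_ino) == (dev, ino):
                matches.append(path)
    return matches
```

The result is a list because a file with several hard links has several paths sharing one ID. An empty list means nothing under `root` matched, for example because the file moved to another volume.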

ingo