
I need to uniquely identify a file on Windows so that I can keep a reference to it even if it is moved or renamed. I did some research and found the question "Unique file identifier in windows", which suggests calling GetFileInformationByHandle from C++, but apparently that only works on NTFS partitions, not on FAT ones.

I need to program behavior like Dropbox's: if you close it on your computer, rename a file, and open it again, it detects the change and syncs correctly. I wonder what the technique is, and how Dropbox does it, if you know.

FileSystemWatcher, for example, would work, but if the program using it is closed, no changes can be detected.

I will be using C#.

Thanks,

Joao de Araujo
  • Why not hash the files data and compare hashes? – Cole Tobin Jul 21 '12 at 23:13
  • The data could change in that meantime. – Joao de Araujo Jul 21 '12 at 23:14
  • If data changes how would you even begin to try to detect that they are the same file? – Joey Jul 21 '12 at 23:15
  • I still don't get it yet: If a file has changed its name, and its contents, why would you consider it identical to what it was before? – tiwo Jul 21 '12 at 23:19
  • Not identical, but I need to keep track of that file while it exists. I need to know which file it is for as long as it exists, even if it has been renamed, moved, or had its content changed. – Joao de Araujo Jul 21 '12 at 23:23
  • What if the file is moved to a device which becomes unavailable later on? What you're trying to achieve sounds like bad design to me. Otherwise you can always use FileSystemWatcher in a resident process (or a service), that should work. – Thomas Jul 21 '12 at 23:34
  • In this design, all the files and folders will be under the same root folder. I will be using FileSystemWatcher, I guess. But having that unique file ID would be more secure. – Joao de Araujo Jul 21 '12 at 23:58

2 Answers


The next best method (though it involves reading every file completely, which I'd avoid whenever possible) would be to compare the file size and a hash (e.g. SHA-256) of the file contents. The probability that two different files match on both is very slim, especially under normal circumstances.

I'd use the GetFileInformationByHandle way on NTFS and fall back to hashing on FAT volumes.
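The size-plus-hash fallback could look roughly like the sketch below. It is written in Python only to keep it short; in C# the same steps would map to FileInfo.Length and System.Security.Cryptography.SHA256. The names `content_fingerprint` and `file_key`, and the `stable_ids` flag, are made up for illustration. The flag is an assumption supplied by the caller (true for a volume known to keep file IDs stable, such as NTFS), since the sketch does not detect the file system itself.

```python
import hashlib
import os


def content_fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) for the file at `path`."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large files are never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return size, digest.hexdigest()


def file_key(path, stable_ids):
    """Pick an identity key for `path`.

    `stable_ids` is a caller-supplied assumption (hypothetical parameter):
    True when the volume keeps file IDs stable (e.g. NTFS), False otherwise.
    """
    if stable_ids:
        info = os.stat(path)
        # Python documents st_ino as the file index on Windows and the
        # inode number on Unix; st_dev identifies the device.
        return ("id", info.st_dev, info.st_ino)
    # Fallback: identify the file by its content, at the cost of a full read.
    return ("hash",) + content_fingerprint(path)
```

Note that the hash key only survives moves and renames. As the comments on the question point out, it stops matching as soon as the content changes.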

In Dropbox's case, though, I think there is a service or process running in the background that observes file system changes. That is the most reliable way, even though it stops working as soon as the service or process is stopped.

Joey

What the asker was most likely looking for is the Windows change journal. It persistently tracks changes such as file renames, so no watcher has to run all the time observing file system events. Instead, one only needs to remember the last position read in the journal and continue from there on the next run. At some point a file with an already known ID will show up with an event of type RENAME, and whoever is interested in that event can apply the same rename to their own copy of that file. The important thing, of course, is to keep track of the IDs of the files in use.
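To make the "remember where you stopped and continue from there" idea concrete, here is a minimal sketch in Python. It does not talk to Windows at all: `JournalRecord` and `replay` are hypothetical names, the journal is an in-memory list, and the reason values are simplified strings. The real journal is read through the Win32 API and its records carry more fields, so treat this purely as an illustration of the bookkeeping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JournalRecord:
    """Simplified stand-in for one change journal entry (not the real layout)."""
    usn: int       # position of this record in the journal
    file_id: int   # stable ID of the affected file
    reason: str    # simplified: "RENAME", "DELETE" or "WRITE"
    name: str      # file name after the change


def replay(records, last_usn, known_names):
    """Apply records newer than `last_usn` to `known_names` (file_id -> name).

    Returns the new cursor, which the caller should persist for the next run.
    """
    cursor = last_usn
    for record in sorted(records, key=lambda r: r.usn):
        if record.usn <= last_usn:
            continue  # already handled during an earlier run
        if record.file_id in known_names:
            if record.reason == "RENAME":
                known_names[record.file_id] = record.name
            elif record.reason == "DELETE":
                del known_names[record.file_id]
        cursor = record.usn
    return cursor


# Example run: the program last stopped at position 100 and tracks file 42.
journal = [
    JournalRecord(90, 42, "WRITE", "report.txt"),
    JournalRecord(120, 42, "RENAME", "report-final.txt"),
    JournalRecord(130, 7, "RENAME", "other.txt"),
]
tracked = {42: "report.txt"}
cursor = replay(journal, last_usn=100, known_names=tracked)
```

After this run, `tracked` maps file 42 to its new name, the rename of the untracked file 7 is ignored, and `cursor` holds the position to store until the program starts again.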

An automatic backup application is one example of a program that must check for changes to the state of a volume to perform its task. The brute force method of checking for changes in directories or files is to scan the entire volume. However, this is often not an acceptable approach because of the decrease in system performance it would cause. Another method is for the application to register a directory notification (by calling the FindFirstChangeNotification or ReadDirectoryChangesW functions) for the directories to be backed up. This is more efficient than the first method, however, it requires that an application be running at all times. Also, if a large number of directories and files must be backed up, the amount of processing and memory overhead for such an application might also cause the operating system's performance to decrease.

To avoid these disadvantages, the NTFS file system maintains an update sequence number (USN) change journal. When any change is made to a file or directory in a volume, the USN change journal for that volume is updated with a description of the change and the name of the file or directory.

https://learn.microsoft.com/en-us/windows/win32/fileio/change-journals

Thorsten Schöning