12

Possible Duplicate:
Best way to determine if two path reference to same file in C/C++

Given two file path strings with potentially different casing and slashes ('\' vs '/'), is there a quick way (that does not involve writing my own function) to normalize both paths to the same form, or at least to test them for equivalence?

I'm restricted to WinAPI and standard C++. All files are local.

Alex B

4 Answers

7

I found a blog posting with the most thorough, even elaborate, function I have ever seen to solve this problem. It handles anything, even horrible corner cases like V:foo.txt where you used the subst command to map V: to Z: but you already used subst to map Z: to some other drive; it loops until all subst commands are unwound. URL:

http://pdh11.blogspot.com/2009/05/pathcanonicalize-versus-what-it-says-on.html

My project is pure C code, and that function is C++. I started to translate it, but then I figured out that I could get the normalized path that I wanted with one function call: GetLongPathName(). This won't handle the horrible corner cases, but it handled my immediate needs.

I discovered that GetLongPathName("foo.txt") just returns foo.txt, but just by prepending ./ to the filename I got the expansion to normalized form:

GetLongPathName("./foo.txt"), if executed in directory C:\Users\steveha, returns C:\Users\steveha\foo.txt.

So, in pseudocode:

if the second char of the pathname is ':' or the first char is '/' or '\':
    just call GetLongPathName()
else:
    copy "./" to a temp buffer
    copy the filename to temp buffer + 2, to get a copy of the filename prepended with "./"
    call GetLongPathName() on the temp buffer
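A minimal sketch of the prefixing step from the pseudocode above, in standard C++ so it can be tried without the Windows headers. The helper name `prependDotSlashIfRelative` is hypothetical and not part of the original code; the returned string is what would then be passed to `GetLongPathName()`, which is not called here.

```cpp
#include <string>

// Hypothetical helper: decides whether the path already starts with a
// drive letter ("C:") or a slash, and prepends "./" only if it does not.
// It performs no file system access and does not call GetLongPathName().
std::string prependDotSlashIfRelative(const std::string& path) {
    const bool hasDrive = path.size() >= 2 && path[1] == ':';
    const bool isRooted = !path.empty() && (path[0] == '/' || path[0] == '\\');
    if (hasDrive || isRooted) {
        return path;        // leave drive-qualified or rooted paths untouched
    }
    return "./" + path;     // bare relative name: add the "./" prefix
}
```

For example, `prependDotSlashIfRelative("foo.txt")` yields `./foo.txt`, while `C:\Users\steveha\foo.txt` comes back unchanged.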

steveha
  • Well, at least that post provides code that looks like it does all the heavy lifting. It certainly handles more corner cases, and more completely, than you'd expect. Clearly the author has burned his fingers on these issues a few times... Your simpler fallback is probably good enough for most cases, and lets the exotic cases be used to fool software that needs fooling. – RBerteig Jun 30 '11 at 00:47
  • My users have probably never even heard of the `subst` command, and my simple C code has worked perfectly so far. – steveha Jul 07 '11 at 05:49
  • The big surprise to us old timers is that `subst` still exists... it is even present in Win7 64-bit, but it dates back to DOS 5.0 if not earlier. Win7 also comes with a bunch of junctions for backward compatibility. Those are giving fits to some of my older Perl scripts that wander through the file system, since junctions aren't quite directories and certainly aren't files. One example is that `C:\Documents and Settings` is a junction mapped to `C:\Users`. – RBerteig Jul 07 '11 at 22:38
  • Why not `GetFullPathName()`? https://msdn.microsoft.com/en-us/library/windows/desktop/aa364963(v=vs.85).aspx – Serge Rogatch Jul 06 '15 at 14:31
  • I answered this question four years ago, and I no longer remember why I used `GetLongPathName()` rather than `GetFullPathName()`. That blog posting from 2009 that I linked shows code that uses `GetFullPathName()`, and if it's good enough for that guy, I'm sure it's a good way to go. – steveha Jul 07 '15 at 07:16
7

Depending on whether the paths could be relative, or contain "..", junction points, or UNC paths, this may be more difficult than you think. The best way might be to use the GetFileInformationByHandle() function, as in this answer.

Edit: I agree with the comment by RBerteig that this may become hard to impossible to do if the paths are not pointing to a local file. Any comment on how to safely handle this case would be greatly appreciated.
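A rough sketch of the GetFileInformationByHandle() comparison, for local files only. The names `FileIdentity`, `sameIdentity`, `openForQuery`, `readIdentity`, and `pathsReferToSameFile` are made up for illustration. The Windows-specific part is guarded by `_WIN32`; the comparison of the volume serial number and the two file index halves is plain C++. Both handles are kept open until the comparison is done, on the assumption that the identity values are only reliable while the files are open.

```cpp
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#endif

// Illustrative copy of the three identifying fields of
// BY_HANDLE_FILE_INFORMATION (volume serial number plus file index).
struct FileIdentity {
    std::uint32_t volumeSerial;
    std::uint32_t indexHigh;
    std::uint32_t indexLow;
};

// Two identities match only if all three fields are equal.
bool sameIdentity(const FileIdentity& a, const FileIdentity& b) {
    return a.volumeSerial == b.volumeSerial &&
           a.indexHigh == b.indexHigh &&
           a.indexLow == b.indexLow;
}

#ifdef _WIN32
// Opens a file or directory for metadata queries only (no read/write access).
static HANDLE openForQuery(const wchar_t* path) {
    return CreateFileW(path, 0,
                       FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                       nullptr, OPEN_EXISTING,
                       FILE_FLAG_BACKUP_SEMANTICS,  // needed to open directories
                       nullptr);
}

// Fills id from an open handle; returns false if the query fails.
static bool readIdentity(HANDLE h, FileIdentity& id) {
    BY_HANDLE_FILE_INFORMATION info;
    if (!GetFileInformationByHandle(h, &info)) return false;
    id.volumeSerial = info.dwVolumeSerialNumber;
    id.indexHigh = info.nFileIndexHigh;
    id.indexLow = info.nFileIndexLow;
    return true;
}

// Returns true only if both paths open and report the same identity.
// Both handles stay open until both identities have been read.
bool pathsReferToSameFile(const wchar_t* a, const wchar_t* b) {
    HANDLE ha = openForQuery(a);
    HANDLE hb = openForQuery(b);
    FileIdentity ia{}, ib{};
    const bool ok = ha != INVALID_HANDLE_VALUE && hb != INVALID_HANDLE_VALUE &&
                    readIdentity(ha, ia) && readIdentity(hb, ib);
    if (ha != INVALID_HANDLE_VALUE) CloseHandle(ha);
    if (hb != INVALID_HANDLE_VALUE) CloseHandle(hb);
    return ok && sameIdentity(ia, ib);
}
#endif
```

Because the check asks the file system rather than comparing strings, casing, slashes, "..", and junctions should not matter; the cost is that both files have to exist and be openable.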

mghie
  • As long as the two paths resolve to files on the same computer, then it looks like GetFileInformationByHandle() is the right answer. If they resolve to different computers, I don't see a guarantee, and I don't see a trivial way to get one, either. It isn't necessarily easy to test for this. – RBerteig Mar 26 '09 at 07:41
  • All files are local in my case, so this works. – Alex B Mar 26 '09 at 08:59
  • @RBerteig: I don't see a trivial way to get one, either. But I found a very non-trivial one and put it in an answer; take a look. Even that one is only mostly foolproof, but it ought to be more than enough for most people. – steveha Jun 29 '11 at 22:00
7

May I suggest PathCanonicalize?

i_am_jorf
  • It doesn't look like that addresses either Junction points or UNC paths... but it does look useful to know about. – RBerteig Mar 26 '09 at 07:42
  • That's the method I was looking for. Not GetFullPathName. – Jim Mischel Mar 26 '09 at 11:54
  • After viewing this answer, I tried using PathCanonicalize() and discovered that it's horribly broken. `PathCanonicalize("../foo.txt")` always returns `/foo.txt`! PathCanonicalize() just does trivial editing on the string, and the above broken-ness is documented behavior. Useless. I will post another answer with what I found. – steveha Jun 29 '11 at 07:12
  • Why should it bother handling junction points? That's done by the object manager. – 0xC0000022L Nov 24 '21 at 13:40
3

There are odd cases. For example, "c:\windows\..\data\myfile.txt" is the same as "c:\data\.\myfile.txt" and "c:\data\myfile.txt". You can have any number of "\.\" and "\..\" in there. You might look into the Windows API function GetFullPathName. It might do canonicalization for you.
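To make the "\.\" and "\..\" cases concrete, here is a purely lexical sketch in standard C++. It is not GetFullPathName and does not reproduce its behavior: the hypothetical function `normalizeLexically` only unifies slashes, lower-cases ASCII letters, and folds "." and ".." segments as text. It assumes drive-letter or relative paths, does not handle UNC prefixes, and can give the wrong answer when ".." crosses a junction, since it never consults the file system.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: textual normalization of a Windows-style path.
std::string normalizeLexically(const std::string& path) {
    // Step 1: unify slashes and fold ASCII case.
    std::string s;
    s.reserve(path.size());
    for (char c : path) {
        if (c == '/') s += '\\';
        else s += static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }

    // Step 2: peel off an optional drive prefix such as "c:".
    std::string prefix;
    std::size_t pos = 0;
    if (s.size() >= 2 && std::isalpha(static_cast<unsigned char>(s[0])) && s[1] == ':') {
        prefix = s.substr(0, 2);
        pos = 2;
    }
    const bool rooted = pos < s.size() && s[pos] == '\\';

    // Step 3: split on '\' and resolve "." and ".." segments.
    std::vector<std::string> parts;
    std::string segment;
    for (std::size_t i = pos; i <= s.size(); ++i) {
        if (i == s.size() || s[i] == '\\') {
            if (segment == "..") {
                if (!parts.empty() && parts.back() != "..") parts.pop_back();
                else if (!rooted) parts.push_back("..");  // keep leading ".." on relative paths
            } else if (!segment.empty() && segment != ".") {
                parts.push_back(segment);
            }
            segment.clear();
        } else {
            segment += s[i];
        }
    }

    // Step 4: reassemble with single backslashes.
    std::string result = prefix;
    if (rooted) result += '\\';
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) result += '\\';
        result += parts[i];
    }
    return result;
}
```

With this sketch, the three spellings of the example path above all normalize to `c:\data\myfile.txt`, so a plain string comparison of the results treats them as equal.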

Jim Mischel