20

How would I compare 2 strings to determine if they refer to the same path in Win32 using C/C++?

While this will handle a lot of cases it misses some things:

_tcsicmp(szPath1, szPath2) == 0

For example:

  • forward slashes / backslashes

  • relative / absolute paths.

[Edit] Title changed to match an existing C# question.

sashoalm
Adam Tegen
  • Does this answer your question? [What is the best way of determining that two file paths are referring to the same file object?](https://stackoverflow.com/questions/29497131/what-is-the-best-way-of-determining-that-two-file-paths-are-referring-to-the-sam) – Stypox Jun 15 '21 at 16:11

11 Answers

36

Open both files with CreateFile, call GetFileInformationByHandle for both, and compare dwVolumeSerialNumber, nFileIndexLow, nFileIndexHigh. If all three are equal they both point to the same file:

GetFileInformationByHandle function

BY_HANDLE_FILE_INFORMATION Structure

MSN
  • 4
    Note files should stay open else it might be that equal numbers will correspond to different files. – jfs Mar 15 '09 at 11:22
  • 1
    The 2016 documentation of the BY_HANDLE_FILE_INFORMATION structure says "The 64-bit identifier in this structure is not guaranteed to be unique on ReFS." and "To retrieve the 128-bit file identifier use the GetFileInformationByHandleEx function with FileIdInfo to retrieve the FILE_ID_INFO structure." This approach works on Windows 8 and newer. It requires the compilation option -D_WIN32_WINNT=_WIN32_WINNT_WIN8. – Bruno Haible Dec 09 '16 at 15:40
  • Index numbers (inodes) will stay the same even after a handle is closed on local filesystems. The only problem is with network drives. Samba has an option to provide inodes that work across sessions (server restarts). – Lothar Jul 05 '19 at 16:58
9

Filesystem library

Since C++17 you can use the standard filesystem library. Include it using #include <filesystem>. You can access it even in older versions of C++, see footnote.

The function you are looking for is equivalent, under namespace std::filesystem:

bool std::filesystem::equivalent(const std::filesystem::path& p1, const std::filesystem::path& p2);

To summarize from the documentation: this function takes two paths as parameters and returns true if they reference the same file or directory, false otherwise. There is also a noexcept overload that takes a third parameter: a std::error_code in which any error is stored.

Example

#include <filesystem>
#include <iostream>
//...

int main() {
    std::filesystem::path p1 = ".";
    std::filesystem::path p2 = std::filesystem::current_path();
    std::cout << std::filesystem::equivalent(p1, p2);
    //...
}

Output:

1
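
The noexcept overload mentioned above can be wrapped so that a missing file or any other error simply counts as "not the same". This is a minimal sketch; the helper name RefersToSameFile is made up for illustration and is not part of the standard library.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not a standard function): returns true only when both
// paths resolve to the same existing file or directory. Any error, such as a
// path that does not exist, is reported as false instead of throwing.
bool RefersToSameFile(const fs::path& p1, const fs::path& p2)
{
    std::error_code ec;
    const bool same = fs::equivalent(p1, p2, ec);  // noexcept overload
    return !ec && same;
}
```

Treating errors as false is a design choice of this sketch; a caller that needs to tell "different files" apart from "could not check" should inspect the std::error_code instead.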

Using filesystem before C++17

To use this library in versions prior to C++17 you have to enable experimental language features in your compiler and include the library in this way: #include <experimental/filesystem>. You can then use its functions under the namespace std::experimental::filesystem. Please note that the experimental filesystem library may differ from the C++17 one. See the documentation here.
For example:

#include <experimental/filesystem>
//...
std::experimental::filesystem::equivalent(p1, p2);
Stypox
5

Use GetFullPathName from kernel32.dll; it gives you the absolute path of the file. Then compare it against the other path using a simple string comparison.

edit: code

TCHAR buffer1[1000];
TCHAR buffer2[1000];
TCHAR buffer3[1000];
TCHAR buffer4[1000];

GetFullPathName(TEXT("C:\\Temp\\..\\autoexec.bat"),1000,buffer1,NULL);
GetFullPathName(TEXT("C:\\autoexec.bat"),1000,buffer2,NULL);
GetFullPathName(TEXT("\\autoexec.bat"),1000,buffer3,NULL);
GetFullPathName(TEXT("C:/autoexec.bat"),1000,buffer4,NULL);
_tprintf(TEXT("Path1: %s\n"), buffer1);
_tprintf(TEXT("Path2: %s\n"), buffer2);
_tprintf(TEXT("Path3: %s\n"), buffer3);
_tprintf(TEXT("Path4: %s\n"), buffer4);

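The code above prints the same path for all four path representations (the third one, \autoexec.bat, is resolved against the current drive, so this assumes the current drive is C:). You might want to do a case-insensitive comparison after that.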
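
The comparison step could look roughly like the sketch below. It assumes both strings are already absolute paths (for example, the buffers filled in by GetFullPathName), and the helper names NormalizeForCompare and SamePathText are invented for this example. Case folding uses std::towlower, which is locale-dependent and only approximates how the file system compares names; UNC roots get no special handling.

```cpp
#include <cwctype>
#include <string>

// Hypothetical helper: lower-cases the path, unifies the separators and
// removes trailing backslashes so that "C:\blah" and "c:\blah\" compare equal.
static std::wstring NormalizeForCompare(std::wstring path)
{
    for (wchar_t& ch : path) {
        if (ch == L'/')
            ch = L'\\';  // in case the input did not go through GetFullPathName
        ch = static_cast<wchar_t>(std::towlower(ch));
    }
    // Keep a drive root such as "c:\" intact (3 characters).
    while (path.size() > 3 && path.back() == L'\\')
        path.pop_back();
    return path;
}

// Hypothetical helper: textual comparison of two absolute paths.
bool SamePathText(const std::wstring& a, const std::wstring& b)
{
    return NormalizeForCompare(a) == NormalizeForCompare(b);
}
```

This still compares names only, so two different paths to the same file (hard links, symbolic links, mapped drives) are reported as different.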

Gabe
user65157
  • 2
    This doesn't work because there can be more than 1 path to the same file – JaredPar Feb 18 '09 at 20:41
  • yes, but GetFullPathName will give you the absolute path to the same file.. so it should be same.. – user65157 Feb 18 '09 at 20:54
  • 9
    Hard links allow a single file to have two different names. And yes, those of you who know only enough to be a danger to yourself and others, Windows does support hard links. – Integer Poet Jul 27 '10 at 22:51
  • @IntegerPoet Came to say exactly that. Hard links are the devils playground. You need more than a path to determine sameness. – Mike McMahon Dec 09 '14 at 16:51
  • This also doesn't seem to account for directories with trailing slashes (i.e. c:\blah vs c:\blah\). – basiphobe Jul 06 '15 at 19:37
  • Hardlinks are the greatest invention since sliced bread. Only people who don't write programs that handle hardlinks are evil. Like Linus who wrote git without support for hardlinks. – Lothar Jul 05 '19 at 17:01
5

See this question: Best way to determine if two path reference to same file in C#

The question is about C#, but the answer is just the Win32 API call GetFileInformationByHandle.

Community
sth
  • 1
    The documentation for GetFileInformationByHandle says: "nFileIndexLow: Low-order part of a unique identifier that is associated with a file. This value is useful ONLY WHILE THE FILE IS OPEN by at least one process. If no processes have it open, the index may change the next time the file is opened." – Integer Poet Jul 27 '10 at 22:53
4

A simple string comparison is not sufficient for comparing paths for equality. On Windows it is quite possible for c:\foo\bar.txt and c:\temp\bar.txt to point to exactly the same file via symbolic and hard links in the file system.

Comparing paths properly essentially forces you to open both files and compare low level handle information. Any other method is going to have flaky results.

Check out this excellent post Lucian made on the subject. The code is in VB but it's pretty translatable to C/C++ as he PInvoke'd most of the methods.

http://blogs.msdn.com/vbteam/archive/2008/09/22/to-compare-two-filenames-lucian-wischik.aspx

JaredPar
  • Updated URL for the link above: https://devblogs.microsoft.com/vbteam/to-compare-two-filenames-lucian-wischik/ Although this will probably only be valid for another week or so anyway, as Microsoft will inevitably break it for the 100th time due to being completely incompetent when it comes to maintaining a website and not breaking thousands of URLs. :-) – Leo Davidson Nov 10 '22 at 23:50
3

If you have access to the Boost libraries, try

bool boost::filesystem::equivalent(const path& p1, const path& p2)

http://www.boost.org/doc/libs/1_53_0/libs/filesystem/doc/reference.html#equivalent

To summarize from the docs: Returns true if the given path objects resolve to the same file system entity, else false.

aldo
  • The documentation behind the link above states what they are doing in boost. They compare the results of [stat()](http://pubs.opengroup.org/onlinepubs/000095399/basedefs/sys/stat.h.html) which is pretty easy to do yourself. In contrast to the most other solutions it is platform independent. – Johannes Jendersie Feb 18 '14 at 08:51
  • 1
    True. One could look up the source code and use it as a guide to implementing this small piece on your own. – aldo Feb 19 '14 at 16:12
3

Based on answers about GetFileInformationByHandle(), here is the code.

Note: This will only work if the file already exists...

//Determine if 2 paths point to the same file...
//Note: This only works if the file exists
static bool IsSameFile(LPCWSTR szPath1, LPCWSTR szPath2)
{
    //Validate the input
    _ASSERT(szPath1 != NULL);
    _ASSERT(szPath2 != NULL);

    //Get file handles
    HANDLE handle1 = ::CreateFileW(szPath1, 0, FILE_SHARE_DELETE | FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL); 
    HANDLE handle2 = ::CreateFileW(szPath2, 0, FILE_SHARE_DELETE | FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL); 

    bool bResult = false;

    //if we could open both paths...
    if (handle1 != INVALID_HANDLE_VALUE && handle2 != INVALID_HANDLE_VALUE)
    {
        BY_HANDLE_FILE_INFORMATION fileInfo1;
        BY_HANDLE_FILE_INFORMATION fileInfo2;
        if (::GetFileInformationByHandle(handle1, &fileInfo1) && ::GetFileInformationByHandle(handle2, &fileInfo2))
        {
            //the paths are the same if they refer to the same file (fileindex) on the same volume (volume serial number)
            bResult = fileInfo1.dwVolumeSerialNumber == fileInfo2.dwVolumeSerialNumber &&
                      fileInfo1.nFileIndexHigh == fileInfo2.nFileIndexHigh &&
                      fileInfo1.nFileIndexLow == fileInfo2.nFileIndexLow;
        }
    }

    //free the handles
    if (handle1 != INVALID_HANDLE_VALUE )
    {
        ::CloseHandle(handle1);
    }

    if (handle2 != INVALID_HANDLE_VALUE )
    {
        ::CloseHandle(handle2);
    }

    //return the result
    return bResult;
}
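
As the comments on this approach point out, the 64-bit index in BY_HANDLE_FILE_INFORMATION is not guaranteed to be unique on ReFS, and the documentation suggests GetFileInformationByHandleEx with FileIdInfo instead. Below is a hedged variant of the function above along those lines. The name IsSameFileById is made up for this sketch, it needs Windows 8 or newer, and it opens the paths with FILE_FLAG_BACKUP_SEMANTICS so that directories can be compared too (a change from the code above, not something the original answer does).

```cpp
#ifdef _WIN32
#ifndef _WIN32_WINNT
#define _WIN32_WINNT 0x0602  // Windows 8; required for FILE_ID_INFO
#endif
#include <windows.h>
#include <cstring>

// Hypothetical variant of IsSameFile that compares the 128-bit file ID.
// Returns false when either path cannot be opened or queried, so "false"
// can also mean "could not tell".
static bool IsSameFileById(LPCWSTR szPath1, LPCWSTR szPath2)
{
    const DWORD share = FILE_SHARE_DELETE | FILE_SHARE_READ | FILE_SHARE_WRITE;

    // FILE_FLAG_BACKUP_SEMANTICS is what allows a directory to be opened.
    HANDLE handle1 = ::CreateFileW(szPath1, 0, share, NULL, OPEN_EXISTING,
                                   FILE_FLAG_BACKUP_SEMANTICS, NULL);
    HANDLE handle2 = ::CreateFileW(szPath2, 0, share, NULL, OPEN_EXISTING,
                                   FILE_FLAG_BACKUP_SEMANTICS, NULL);

    bool bResult = false;
    if (handle1 != INVALID_HANDLE_VALUE && handle2 != INVALID_HANDLE_VALUE)
    {
        FILE_ID_INFO id1 = {};
        FILE_ID_INFO id2 = {};
        // Both handles stay open while the IDs are read and compared.
        if (::GetFileInformationByHandleEx(handle1, FileIdInfo, &id1, sizeof(id1)) &&
            ::GetFileInformationByHandleEx(handle2, FileIdInfo, &id2, sizeof(id2)))
        {
            bResult = id1.VolumeSerialNumber == id2.VolumeSerialNumber &&
                      std::memcmp(id1.FileId.Identifier, id2.FileId.Identifier,
                                  sizeof(id1.FileId.Identifier)) == 0;
        }
    }

    if (handle1 != INVALID_HANDLE_VALUE) ::CloseHandle(handle1);
    if (handle2 != INVALID_HANDLE_VALUE) ::CloseHandle(handle2);
    return bResult;
}
#endif  // _WIN32
```

Whether FileIdInfo is supported on every file system or network share is not something this sketch checks; on failure it simply falls through and returns false.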
Community
Adam Tegen
  • The documentation for GetFileInformationByHandle says: "nFileIndexLow: Low-order part of a unique identifier that is associated with a file. This value is useful ONLY WHILE THE FILE IS OPEN by at least one process. If no processes have it open, the index may change the next time the file is opened." – Integer Poet Jul 27 '10 at 22:54
  • 1
    @IntegerPoet: And your point is? You can't even call `GetFileInformationByHandle` unless the file is open, and this code doesn't close either handle between calls. – Ben Voigt Dec 31 '11 at 04:44
  • Good question. I wrote that comment so long ago that I no longer remember what I had in mind. – Integer Poet Jan 04 '12 at 00:15
  • What I must have meant is that the return value of IsSameFile (bResult) becomes stale even before IsSameFile returns. – Integer Poet Feb 09 '12 at 04:30
  • The 2016 documentation of the BY_HANDLE_FILE_INFORMATION structure says "The 64-bit identifier in this structure is not guaranteed to be unique on ReFS." and "To retrieve the 128-bit file identifier use the GetFileInformationByHandleEx function with FileIdInfo to retrieve the FILE_ID_INFO structure." This approach works on Windows 8 and newer. It requires the compilation option -D_WIN32_WINNT=_WIN32_WINNT_WIN8. – Bruno Haible Dec 09 '16 at 15:41
2

What you need to do is get the canonical path.

For each path you have, ask the file system to convert it to a canonical path or to give you an identifier that uniquely identifies the file (such as the inode).

Then compare the canonical path or the unique identifier.

Note: Do not try to figure out the canonical path yourself. The file system can do things with symbolic links etc. that are not easily tractable unless you are very familiar with the file system.
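
One way to ask the file system for the canonical path, assuming C++17 is available, is std::filesystem::canonical. The sketch below is only an illustration of the idea; the helper name SameCanonicalPath is invented, and the comparison of the resulting paths is plain text equality.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: lets the file system resolve both paths (symbolic
// links, "." and ".." components) and compares the results. Paths that do
// not exist or cannot be resolved are reported as "not the same".
bool SameCanonicalPath(const fs::path& p1, const fs::path& p2)
{
    std::error_code ec1, ec2;
    const fs::path c1 = fs::canonical(p1, ec1);
    const fs::path c2 = fs::canonical(p2, ec2);
    if (ec1 || ec2)
        return false;
    return c1 == c2;
}
```

A canonical path is still a name, so two hard links to the same file yield different canonical paths; comparing a unique identifier covers that case.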

Martin York
0

If the files exist and you can deal with the potential race condition and the performance hit from opening the files, an imperfect solution that should work on any platform is to open one file for writing by itself, close it, and then open it for writing again after opening the other file for writing. Since write access should be exclusive, if you were able to open the first file for writing the first time but not the second time, then chances are you blocked your own request when you tried to open both files.

(chances, of course, are also that some other part of the system has one of your files open)

Matthew
0

Open both files and use GetFinalPathNameByHandle() against the HANDLEs. Then compare the paths.
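
A rough sketch of this idea follows. The helper names FinalPathOf and SameFinalPath are invented for illustration, the handles are opened with FILE_FLAG_BACKUP_SEMANTICS so directories work as well, and the final comparison ignores case via CompareStringOrdinal. It requires Windows Vista or newer.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <string>

// Hypothetical helper: returns the final (resolved) path of an existing file
// or directory, or an empty string on failure.
static std::wstring FinalPathOf(LPCWSTR path)
{
    HANDLE h = ::CreateFileW(path, 0,
                             FILE_SHARE_DELETE | FILE_SHARE_READ | FILE_SHARE_WRITE,
                             NULL, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return std::wstring();

    std::wstring result;
    // With an empty buffer the call returns the required size in characters,
    // including the terminating null.
    const DWORD needed = ::GetFinalPathNameByHandleW(h, NULL, 0, FILE_NAME_NORMALIZED);
    if (needed != 0)
    {
        std::wstring buffer(needed, L'\0');
        // On success the return value excludes the terminating null.
        const DWORD written =
            ::GetFinalPathNameByHandleW(h, &buffer[0], needed, FILE_NAME_NORMALIZED);
        if (written != 0 && written < needed)
        {
            buffer.resize(written);
            result = buffer;
        }
    }
    ::CloseHandle(h);
    return result;
}

// Hypothetical helper: compares the final paths, ignoring case.
bool SameFinalPath(LPCWSTR path1, LPCWSTR path2)
{
    const std::wstring a = FinalPathOf(path1);
    const std::wstring b = FinalPathOf(path2);
    if (a.empty() || b.empty())
        return false;
    return ::CompareStringOrdinal(a.c_str(), -1, b.c_str(), -1, TRUE) == CSTR_EQUAL;
}
#endif  // _WIN32
```

Like any name-based comparison, this reports two hard links to the same file as different, because each link has its own final path.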

CodeAngry
0

Comparing the actual path strings will not produce accurate results if you refer to UNC or Canonical paths (i.e. anything other than a local path).

shlwapi.h has some Path Functions that may be of use to you in determining whether your paths are the same.

It contains functions like PathIsRoot that could be used in a function of greater scope.

user62572