
My issue is the following: to determine whether two paths are identical on the Windows platform, paths must be compared case-insensitively, i.e. "C:\test.txt" and "C:\Test.txt" resolve to the same file element. I could solve this easily by using std::filesystem::equivalent, for example, but for this particular problem I would like to save a bit on OS round trips (the code runs on idle and does 100+ compares on each loop, so I fear the cost is going to be noticeable).

    using path = std::filesystem::path;
    const bool result = (path("C:\\test.txt").lexically_normal().make_preferred().native() == path("C:\\Test.txt").lexically_normal().make_preferred().native());

Comparisons of std::filesystem::path objects, even ones lexically normalized by calling lexically_normal, are done in the generic way, so case is considered. This makes sense of course, but aside from doing the string comparison myself I do not see a way to do this with the library: is it possible to somehow override how paths are compared?

I also looked into boost::filesystem, but as far as I could see it does not address the issue either.

darune
  • I believe it depends on the OS's filesystem. On Windows it is case insensitive and subsequently `std::filesystem::path` should be the same. On the other hand, on Linux filesystem it is case sensitive. – ALX23z Apr 21 '20 at 19:59

1 Answer


The whole point of the path/filesystem distinction is to make a distinction between the path type as a generic mechanism for storing paths that can be manipulated and used with any filesystem, and the filesystem-specific operations that may differ based on particular implementations. That two non-equal path objects could be considered filesystem::equivalent by one filesystem but not by another is just a part of the deal.

There is no mechanism to perform filesystem-aware path "normalization" that isn't a filesystem operation.

Nicol Bolas
  • Rather than looking at this in terms of "filesystem awareness", I think it makes more sense to think in terms of application requirements. How can an application leverage filesystem::path to implement path semantics, while adding the constraint of case-insensitivity? For example, it may be necessary for a service (such as a git service) to block case-insensitive file path collisions, to support the least common denominator among its clients. – John Doggett Aug 01 '20 at 07:14
  • @JohnDoggett: The filesystem API is not there to allow you to access any filesystem; it exists to allow you to access a specific filesystem: the one provided by your OS. You can try to use the mechanisms with `path` outside of that context of course, but it's probably not going to be great at it. – Nicol Bolas Aug 01 '20 at 13:22
  • It sounds like I misunderstood your opening sentence. I thought you meant that filesystem and path were intended to be independently good at filesystem access and path semantics, respectively. Instead it sounds like the path class is only intended to be used in expression of real interactions with a filesystem. This is unfortunate, because so much of the work in applications is in modeling the *potential* to interact with filesystems, to plan and validate the work before committing it. That early modeling logic is where the fancy path semantics would have been useful. – John Doggett Aug 01 '20 at 21:54
  • @NicolBolas Do you not acknowledge the problem (basically, it's impossible to compare FS elements semantically for some FS types...)? This answer could be improved a lot if you would suggest at least one solution to the original problem (since the answer is basically this: "this is not possible"). – darune Aug 03 '20 at 09:48
  • @darune: Filesystem-specific comparison has to be part of the *filesystem*, not of the `path`. And if you want to implement that, you're going to have to basically extract strings from those paths and compare them as strings, since you don't really have an effective way to access the internals of the `path`. – Nicol Bolas Aug 03 '20 at 13:28
  • @NicolBolas, [ISO/IEC TS 18822](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4100.pdf): **4.7 filename**: The following characteristics of filenames are operating system dependent: §6 Case awareness and sensitivity during path resolution **8.4.8 path compare:** §2 Returns: A value less than 0 if `native()` … **8.4.6 path native format observers** §1 The string returned by all native format observers is in the native pathname format. **4.11 native pathname format** §1 The operating system dependent pathname format accepted by the host operating system. – Aleksey F. Nov 14 '21 at 22:24
  • You can use `std::filesystem::canonical` with C++17: https://stackoverflow.com/a/56674900/2163727 – MasterHD Apr 21 '22 at 22:31
  • The cppreference.com description of `std::filesystem::canonical()` doesn't mention that it does anything to the case of the parameter, so I think it doesn't address this question. – Larry Engholm Feb 16 '23 at 17:23
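
The string-extraction approach suggested in the comments above could be sketched roughly as follows. This is only a minimal illustration, not something provided by `std::filesystem`: the helper name `equal_ignoring_ascii_case` is hypothetical, and it assumes that folding ASCII letters is good enough. Non-ASCII characters are compared byte for byte, which does not match how Windows actually folds case, so names outside ASCII would need a proper Unicode-aware routine.

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper (not part of std::filesystem): compares two paths
// purely lexically, ignoring ASCII case. It never touches the filesystem.
bool equal_ignoring_ascii_case(const fs::path& a, const fs::path& b)
{
    // Collapse "." and ".." lexically, then take the generic form so that
    // both sides use '/' as the directory separator.
    const std::string lhs = a.lexically_normal().generic_string();
    const std::string rhs = b.lexically_normal().generic_string();

    // Assumption: ASCII-only case folding. Bytes outside ASCII are left to
    // std::tolower in the current C locale, i.e. effectively unchanged.
    return std::equal(lhs.begin(), lhs.end(), rhs.begin(), rhs.end(),
                      [](unsigned char x, unsigned char y) {
                          return std::tolower(x) == std::tolower(y);
                      });
}
```

With this, `equal_ignoring_ascii_case("C:/test.txt", "C:/Test.txt")` returns `true` without any OS round trip. Unlike `std::filesystem::equivalent`, it knows nothing about symlinks, hard links, short 8.3 names, or whether the underlying volume is really case-insensitive, so it only answers the lexical question.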