60

How can one with minimal effort (using some already existing facility, if possible) convert paths like c:\aaa\bbb\..\ccc to c:\aaa\ccc?

Matthias Braun
mark

4 Answers

75

I would write it like this:

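public static string NormalizePath(string path)
{
    // Uri decodes escapes such as %20; GetFullPath resolves "." and ".." segments.
    return Path.GetFullPath(new Uri(path).LocalPath)
               // Drop any trailing directory separator.
               .TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar)
               // Upper-case so that paths differing only in case compare equal.
               .ToUpperInvariant();
}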

This should handle a few scenarios, such as:

  1. URIs and any escaped characters in them, for example:

    file:///C:/Test%20Project.exe -> C:\TEST PROJECT.EXE

  2. path segments specified by dots to denote current or parent directory

    c:\aaa\bbb\..\ccc -> C:\AAA\CCC

  3. tilde-shortened (8.3) paths, expanded to their long form

    C:\Progra~1\ -> C:\PROGRAM FILES

  4. inconsistent directory separator characters

    C:/Documents\abc.txt -> C:\DOCUMENTS\ABC.TXT

Beyond those, it also ignores differences in case and any trailing \ directory separator.

Community
nawfal
  • 1
    Good and concise solution to path normalization, exactly what I was looking for. +1 – Syon Jan 16 '14 at 14:42
  • 33
    Do not use ToUpper() and friends for any code you want to be portable. There are case sensitive filesystems in the world. Also it's not so nice if you're showing these values to users, in which case you want to preserve case and use case-insensitive sorting and comparisons. Otherwise, looks good. – dhasenan Sep 20 '15 at 15:50
  • 2
    It depends on exactly what you mean by "canonical" but, since Windows treats file paths as case insensitive, I would argue that you *do* need a case conversion, otherwise it's possible for there to be more than one "canonical" path for the same file. I would prefer lower case though. – Andy Jun 15 '16 at 13:54
  • It doesn't work with relative paths. This way it does: private string NormalizePath(string path) { return path.Replace(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar) .ToUpperInvariant(); } – Mr.B Aug 23 '16 at 13:06
  • This was a great solution to the problem. Thank you for posting this. – Bill Aug 26 '16 at 20:21
  • 6
    @Andy: On the other hand, if one uses this variant of `NormalizePath` to copy or move a file somewhere, she/he most probably expects the casing not to change. As a user, I would ban any program that changes my carefully maintained naming systems. – Sebastian Mach Dec 06 '17 at 09:09
  • Maybe it would just be best to encapsulate raw path strings, and instead provide (or, preferably, use) some class which defines comparison operations and everything one needs. – Sebastian Mach Dec 06 '17 at 09:10
  • 1
    Because it really *does* matter: *don't change the naming casing! It is rude from a user-visual perspective, and sometimes downright problematic.* If there is a reason to do a 'canonical compare', that's the job of a string case-insensitive compare (possibly with a dictionary, or whatever): the CI access is handled *post-"canonical"* in the Windows Filesystem API! – user2864740 Jan 28 '18 at 21:03
  • It does not work with a drive letter without a trailing backslash, like C: (it is not a valid URI). Not sure it's a legal path, but definitely one that a user can type. – Maxence Feb 27 '18 at 07:53
  • Beware, the Uri class may encode characters such as `+`, spaces, and others. – mihca Aug 10 '23 at 07:11
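
Several of the comments above suggest keeping the original casing and leaving case-insensitivity to the comparison. A minimal sketch of that idea follows. The class and method names (`PathComparison`, `NormalizePreservingCase`, `SamePath`) are hypothetical and not part of the answer, and the default of `StringComparison.OrdinalIgnoreCase` is an assumption that only suits case-insensitive file systems such as those typical on Windows.

```csharp
using System;
using System.IO;

public static class PathComparison
{
    // Hypothetical helper: same steps as NormalizePath in the answer,
    // minus the Uri round trip and the upper-casing.
    public static string NormalizePreservingCase(string path)
    {
        // Like the answer's version, this turns a bare root such as C:\ into C:.
        return Path.GetFullPath(path)
                   .TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
    }

    // Hypothetical helper: the caller chooses the comparison. OrdinalIgnoreCase
    // is an assumption that fits case-insensitive file systems only.
    public static bool SamePath(string a, string b,
        StringComparison comparison = StringComparison.OrdinalIgnoreCase)
    {
        return string.Equals(NormalizePreservingCase(a),
                             NormalizePreservingCase(b),
                             comparison);
    }
}
```

On Windows, `PathComparison.SamePath(@"c:\aaa\bbb\..\ccc", @"C:\AAA\CCC\")` should return true while both inputs keep their original casing for display or for copy and move operations. On a case-sensitive file system, passing `StringComparison.Ordinal` would be the safer choice.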
60

Path.GetFullPath perhaps?
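
As a rough illustration of that suggestion, the sketch below calls `Path.GetFullPath` on the path from the question and on a relative path built with `Path.Combine`. The class and method names (`GetFullPathDemo`, `Resolve`) are hypothetical. The Windows result shown in the comment is the one reported in the comments under this answer.

```csharp
using System;
using System.IO;

public static class GetFullPathDemo
{
    // Hypothetical wrapper, present only to give the call a name.
    public static string Resolve(string path)
    {
        return Path.GetFullPath(path);
    }

    public static void Main()
    {
        // The path from the question; on Windows this gives c:\aaa\ccc.
        // The directories do not have to exist.
        Console.WriteLine(Resolve(@"c:\aaa\bbb\..\ccc"));

        // A relative path is resolved against the current working directory,
        // so the result always comes back absolute.
        string relative = Path.Combine("aaa", "bbb", "..", "ccc");
        Console.WriteLine(Resolve(relative));
    }
}
```

The second call shows why the result of a relative input gains a prefix: the method always returns an absolute path.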

leppie
  • 4
    I do not believe this is guaranteed to return a canonical name. It only guarantees the name returned can be used to reference the file absolutely vs. relatively – JaredPar Aug 12 '09 at 14:52
  • 6
    Path.GetFullPath(@"c:\aaa\bbb\..\ccc") = c:\aaa\ccc - good enough for me. – mark Aug 12 '09 at 14:54
  • Correct, but it doesn't check if the path is valid. – H H Aug 12 '09 at 14:54
  • 12
    @Henk: Path utils should not actually check for a valid file, or even touch the file system (but there are a few cases it does). – leppie Aug 12 '09 at 14:58
  • Doesn't work for relative file paths, since a prefix like: 'C:/' will be added. – My-Name-Is Jun 18 '14 at 13:05
  • 1
    @My-Name-Is: That depends entirely on how you use it. – leppie Jun 20 '14 at 12:46
  • For this method there aren't any overloads. How should I use it differently? – My-Name-Is Jun 20 '14 at 15:09
  • 1
    @My-Name-Is: That's what GetFullPath should do. NB Path.GetFullPath(@"\..\aaa") returns the nonsense "C:\..\aaa" whereas Path.GetFullPath(@"..\aaa") returns an absolute path relative to your current directory (Environment.CurrentDirectory) – Chris F Carroll Jul 11 '14 at 12:22
  • Answer below takes care of case sensitivity. – kwesolowski Aug 15 '15 at 13:01
  • Note that [another bit of documentation](https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats#path-normalization) does explicitly state that Path.GetFullPath normalizes, which includes canonicalization. The documentation for Path.GetFullPath states that "if path does exist, the caller must have permission to obtain path information for path. Note that unlike most members of the Path class, this method accesses the file system." (The Uri class mentioned in @bdukes's answer does not access the file system.) – Legolas Oct 30 '20 at 16:29
29

Canonicalization is one of the main responsibilities of the Uri class in .NET.

var path = @"c:\aaa\bbb\..\ccc";
var canonicalPath = new Uri(path).LocalPath; // c:\aaa\ccc
bdukes
  • So I assume this checks that the path actually exists? – ashes999 Dec 28 '11 at 21:22
  • 6
    No, the `Uri` class is only responsible for generating paths. The system against which those paths are relevant is not taken into account. Once you get the path via the method in my answer, you'd still need to check that it exists via the `File` class (or whatever). – bdukes Dec 29 '11 at 14:47
  • 8
    Note that still doesn't normalise drive letter case (e.g. "C:\" and "c:\" both come out unaltered). So this isn't really "canonical" in the sense of being unique, at any rate. – Alastair Maw Jun 16 '15 at 11:21
  • 3
    @AlastairMaw Since the Windows FS is CI, assuming a path is 'canonical' then any other path differing in case only IS canonical-and-equivalent *even with* casing differences. The consumer should also use CI string compares as relevant, as all case-different forms *are* the same. – user2864740 Jan 28 '18 at 21:10
  • 1
    Beware, the URI class may encode characters such as `+`, spaces, and others. – mihca Aug 10 '23 at 07:10
1

FileInfo objects can also help here. (https://learn.microsoft.com/en-us/dotnet/api/system.io.fileinfo?view=net-5.0)

var x = Path.Combine(@"C:\temp", "..\\def/abc");
var y = new FileInfo(x).FullName; // "C:\\def\\abc"

Choosing between FileInfo and DirectoryInfo also lets you make explicit whether the path is meant to be a file or a directory.

But Path.GetFullPath is better if you just need the string.
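
To make the file versus directory distinction concrete, here is a small sketch using both classes on relative paths that contain a `..` segment. The class name `InfoDemo`, its helper methods, and the sample path segments are hypothetical. Neither object requires the path to exist.

```csharp
using System;
using System.IO;

public static class InfoDemo
{
    // Hypothetical helper: treat the path as a file.
    public static FileInfo AsFile(string path)
    {
        return new FileInfo(path);
    }

    // Hypothetical helper: treat the path as a directory.
    public static DirectoryInfo AsDirectory(string path)
    {
        return new DirectoryInfo(path);
    }

    public static void Main()
    {
        FileInfo file = AsFile(Path.Combine("temp", "..", "def", "abc.txt"));
        Console.WriteLine(file.FullName);      // absolute path with ".." resolved
        Console.WriteLine(file.Name);          // abc.txt
        Console.WriteLine(file.DirectoryName); // containing directory, as a string

        DirectoryInfo dir = AsDirectory(Path.Combine("temp", "..", "def"));
        Console.WriteLine(dir.FullName);       // absolute path with ".." resolved
        Console.WriteLine(dir.Name);           // def
    }
}
```

The extra properties (`Name`, `DirectoryName`, and so on) are the main reason to reach for these classes over a plain string.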

Mike S