2

I am looking for a simple method to encode/escape and decode/unescape file paths (the characters that are illegal in file paths are "\/?:<>*|).

HttpUtility.UrlEncode does its job, except it does not encode the * character.

All I could find was escaping with regex, or just replacing the illegal characters with _, which cannot be reversed.

I want to be able to encode/decode consistently.

I want to know if there's a pre-defined way to do that or I just need to write some code to encode and another piece to decode.

Thanks

Regis Portalez
Vicentiu B
  • Do you have a web application? – Alex Feb 26 '13 at 11:06
  • Yes, it's in a web application, but I need it to write something to the disk. So I can use `HttpUtility` or other classes – Vicentiu B Feb 26 '13 at 11:07
  • you can also try `Uri.EscapeUriString` method. Otherwise it is more flexible to use regex - thus you will have full control on how to handle illegal characters – Nogard Feb 26 '13 at 11:08
  • Thanks Nogard, but `Uri.EscapeUriString` does not escape `?:/*` which are invalid path characters – Vicentiu B Feb 26 '13 at 11:09
  • There is no such function out of the box. You will have to stick with some custom solution. Using regex as in this question may be best for you: http://stackoverflow.com/questions/1032105/encoding-file-paths – Nogard Feb 26 '13 at 11:11
  • If the filename must not be directly readable within Explorer you could simply de-/encode to/from base64 like in [this example](http://arcanecode.com/2007/03/21/encoding-strings-to-base64-in-c/). – Oliver Feb 26 '13 at 11:48
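As a rough sketch of the base64 idea from the last comment: plain base64 output can contain `/`, which is itself illegal in a file name, so this version swaps in the URL-safe alphabet (`-` and `_`). The class name `Base64FileName` is hypothetical, and UTF-8 is an assumed encoding choice; neither comes from the linked example.

```csharp
using System;
using System.Text;

// Hypothetical helper: reversible file-name encoding via URL-safe base64.
static class Base64FileName
{
    public static string Encode(string name)
    {
        string b64 = Convert.ToBase64String(Encoding.UTF8.GetBytes(name));
        // '+' and '/' are replaced so the result contains no path separators.
        return b64.Replace('+', '-').Replace('/', '_');
    }

    public static string Decode(string encoded)
    {
        // Restore the standard base64 alphabet before decoding.
        string b64 = encoded.Replace('-', '+').Replace('_', '/');
        return Encoding.UTF8.GetString(Convert.FromBase64String(b64));
    }
}
```

One caveat with this approach: base64 is case-sensitive, so on a case-insensitive file system two different original names could end up as file names that differ only in case.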

3 Answers

6

I've never tried anything like this before, so I threw this together:

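using System;
using System.Text.RegularExpressions;

static class PathEscaper
{
    static readonly string invalidChars = @"""\/?:<>*|";
    static readonly string escapeChar = "%";

    // Matches the escape character itself or any forbidden character.
    static readonly Regex escaper = new Regex(
        "[" + Regex.Escape(escapeChar + invalidChars) + "]",
        RegexOptions.Compiled);
    // Matches the escape character followed by exactly four uppercase hex digits.
    static readonly Regex unescaper = new Regex(
        Regex.Escape(escapeChar) + "([0-9A-F]{4})",
        RegexOptions.Compiled);

    public static string Escape(string path)
    {
        // Replace each matched character with % plus its UTF-16 code unit in hex.
        return escaper.Replace(path,
            m => escapeChar + ((int)m.Value[0]).ToString("X4"));
    }

    public static string Unescape(string path)
    {
        // Turn each %XXXX sequence back into the character it encodes.
        return unescaper.Replace(path,
            m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());
    }
}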

It replaces any forbidden character (and the % character itself, so the encoding stays unambiguous) with a % followed by its 16-bit representation in hex, and back. (You could probably get away with an 8-bit representation for the specific characters you have, but I thought I'd err on the safe side.)
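To make the round trip concrete, here is a minimal, self-contained sketch of the same scheme. The class name `PathEscaperDemo` and the sample path are made up for illustration, and the unescape pattern is restricted to hex digits (`[0-9A-F]`), which is an assumption about the intended behaviour.

```csharp
using System;
using System.Text.RegularExpressions;

static class PathEscaperDemo
{
    const string EscapeChar = "%";
    const string InvalidChars = "\"\\/?:<>*|";

    // Matches the escape character itself or any forbidden character.
    static readonly Regex Escaper = new Regex(
        "[" + Regex.Escape(EscapeChar + InvalidChars) + "]");
    // Only four uppercase hex digits may follow the escape character.
    static readonly Regex Unescaper = new Regex(
        Regex.Escape(EscapeChar) + "([0-9A-F]{4})");

    public static string Escape(string path) =>
        Escaper.Replace(path, m => EscapeChar + ((int)m.Value[0]).ToString("X4"));

    public static string Unescape(string path) =>
        Unescaper.Replace(path,
            m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());

    static void Main()
    {
        string original = "a/b:c*d%e";
        string escaped = Escape(original);
        Console.WriteLine(escaped);                       // a%002Fb%003Ac%002Ad%0025e
        Console.WriteLine(Unescape(escaped) == original); // True
    }
}
```

Because `%` is escaped along with the forbidden characters, every `%` in the escaped string starts a genuine escape sequence, which is what makes decoding unambiguous.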

Rawling
4

Rawling's solution is good, but there is a small problem: the filename it generates may contain "%", which can cause errors if you use the path as a URL. So I changed the escapeChar from "%" to "__" to make sure the generated filename is compatible with URL conventions.

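using System;
using System.Text.RegularExpressions;

static class PathEscaper
{
    static readonly string invalidChars = @"""\/?:<>*|";
    static readonly string escapeChar = "__";

    // The character class contains '_' as well, so every literal underscore
    // in the input is escaped too; that keeps the decoding unambiguous.
    static readonly Regex escaper = new Regex(
        "[" + Regex.Escape(escapeChar + invalidChars) + "]",
        RegexOptions.Compiled);
    // Matches "__" followed by exactly four uppercase hex digits.
    static readonly Regex unescaper = new Regex(
        Regex.Escape(escapeChar) + "([0-9A-F]{4})",
        RegexOptions.Compiled);

    public static string Escape(string path)
    {
        // Replace each matched character with "__" plus its UTF-16 code unit in hex.
        return escaper.Replace(path,
            m => escapeChar + ((int)m.Value[0]).ToString("X4"));
    }

    public static string Unescape(string path)
    {
        // Turn each __XXXX sequence back into the character it encodes.
        return unescaper.Replace(path,
            m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());
    }
}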
-1

I've been using the following method for a while without a problem:

public static string SanitizeFileName(string filename) {
    string regex = String.Format(@"[{0}]+", Regex.Escape(new string(Path.GetInvalidFileNameChars())));
    return Regex.Replace(filename, regex, "_");
}
Brian Kintz
  • Thanks, but as I stated in my question, I want to be able to decode - i.e. reconstruct the original string from the escaped string – Vicentiu B Feb 26 '13 at 11:15
  • In that case you will definitely need a custom solution... AFAIK there's nothing in .NET that offers en-/decode functionality – Brian Kintz Feb 26 '13 at 11:22