20

How do you properly encode a path that includes a hash (#) in it? Note the hash is not the fragment (bookmark?) indicator but part of the path name.

For example, if there is a path like this:

http://www.contoso.com/code/c#/somecode.cs

It causes problems when you try, for example, to do this:

Uri myUri = new Uri("http://www.contoso.com/code/c#/somecode.cs");

It would seem that it interprets the hash as the fragment indicator.
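
To make the symptom concrete, here is a minimal sketch of a console program (the class name FragmentDemo is only for illustration) that prints how Uri splits that string:

```csharp
using System;

class FragmentDemo
{
    static void Main()
    {
        var myUri = new Uri("http://www.contoso.com/code/c#/somecode.cs");

        // Everything from the first '#' onward is parsed as the fragment,
        // so the path stops at "/code/c".
        Console.WriteLine(myUri.AbsolutePath); // /code/c
        Console.WriteLine(myUri.Fragment);     // #/somecode.cs
    }
}
```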

It feels wrong to manually replace # with %23. Are there other characters that should be replaced? There are some escaping methods in Uri and HttpUtility but none seem to do the trick.

Dodgyrabbit

3 Answers

8

There are a few characters you are not supposed to use. You can try to work your way through this very dry documentation, or refer to this handy URL summary on Stack Overflow.

If you check out this very website, you'll see that their C# questions are encoded %23.

Stack Overflow C# Questions

You can do this using either of the following. For ASP.NET:

string url = string.Format("http://www.contoso.com/code/{0}/somecode.cs",
    Server.UrlEncode("c#")
);

Or for class libraries / desktop:

// Requires a reference to System.Web
string url = string.Format("http://www.contoso.com/code/{0}/somecode.cs",
    HttpUtility.UrlEncode("c#")
);
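
If referencing System.Web is undesirable, one alternative (not what this answer uses) is Uri.EscapeDataString from the System namespace, which percent-encodes a single path segment. The sketch below is only an illustration; the class name SegmentEncodingDemo is made up for the example.

```csharp
using System;

class SegmentEncodingDemo
{
    static void Main()
    {
        // Escape only the dynamic segment, never the whole URL.
        string segment = Uri.EscapeDataString("c#"); // c%23
        string url = string.Format("http://www.contoso.com/code/{0}/somecode.cs", segment);

        var uri = new Uri(url);
        Console.WriteLine(uri.AbsolutePath);    // /code/c%23/somecode.cs
        Console.WriteLine(uri.Fragment.Length); // 0, the '#' is no longer read as a fragment
    }
}
```

One difference worth knowing: HttpUtility.UrlEncode turns a space into +, which is a query-string convention, whereas Uri.EscapeDataString emits %20, which is what a path segment needs.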
Fenton
  • Please note that encoding the entire URL this way (including scheme and path) actually yields a string that is not a valid URI. It throws Invalid URI: The format of the URI could not be determined if you try constructing a URI from it. – Dodgyrabbit Feb 17 '12 at 02:53
  • I think the point is missed here. It's not *just* about the # character, but rather, how to construct an arbitrary URI with a valid path that might contain any of the invalid characters. – Dodgyrabbit Feb 17 '12 at 15:15
  • @Dodgyrabbit given that we don't know which parts of the URL are dynamic, I have left it to the OP to decide where it is appropriate to use the utility class. – Fenton Feb 17 '12 at 15:32
  • I don't think that System.Web.HttpUtility.UrlEncode gets rid of the # sign. I just tried it and it's still there, thus truncating the query at the server side. – Ted Oct 11 '12 at 18:43
  • @Ted it results in C%23 - how were you testing the value? – Fenton Oct 12 '12 at 08:11
  • Hey, sorry for the late reply. I tried a C# client (WebClient), but when received at the server it was truncated after the #. If I look at the string on the client side, the # is indeed "gone", but .NET on the server side seems to UrlDecode it "behind the scenes" and then truncates it before the C# code implementing the method is executed. – Ted Oct 15 '12 at 15:57
6

Did some more digging, friends, and found a duplicate question for Java: HTTP URL Address Encoding in Java

The .NET Uri class does not offer the constructor we need, but UriBuilder does.

So, in order to construct a proper URI where the path contains illegal characters, do this:

// Build Uri by explicitly specifying the constituent parts. This way, the hash is not confused with fragment identifier
UriBuilder uriBuilder = new UriBuilder("http", "www.contoso.com", 80, "/code/c#/somecode.cs");

Debug.WriteLine(uriBuilder.Uri);
// This outputs: http://www.contoso.com/code/c%23/somecode.cs

Notice how it does not unnecessarily escape parts of the URI that do not need escaping (like the :// part), which is what happens with HttpUtility.UrlEncode. It would seem that the purpose of HttpUtility.UrlEncode is actually to encode the query string/fragment part of the URL, not the scheme or hostname.
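
As a quick check of the above, the following sketch inspects the resulting Uri and then adds a genuine fragment through the Fragment property. The class name UriBuilderDemo and the fragment value "section" are made up for illustration.

```csharp
using System;

class UriBuilderDemo
{
    static void Main()
    {
        var builder = new UriBuilder("http", "www.contoso.com", 80, "/code/c#/somecode.cs");

        // The '#' was escaped as part of the path, so no fragment is present.
        Console.WriteLine(builder.Uri.AbsolutePath);    // /code/c%23/somecode.cs
        Console.WriteLine(builder.Uri.Fragment.Length); // 0

        // A real fragment is supplied separately and stays distinct from the path.
        builder.Fragment = "section";
        Console.WriteLine(builder.Uri.AbsolutePath);    // /code/c%23/somecode.cs
        Console.WriteLine(builder.Uri.Fragment);        // #section
    }
}
```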

Dodgyrabbit
3

Use UrlEncode: System.Web.HttpUtility.UrlEncode(string)

using System;
using System.Web;

class Program
{
    static void Main(string[] args)
    {
        string url = "http://www.contoso.com/code/c#/somecode.cs";
        string enc = HttpUtility.UrlEncode(url);

        Console.WriteLine("Original: {0} ... Encoded {1}", url, enc);
        Console.ReadLine();
    }
}
LiquidPony
  • The string enc in the example, although escaped, is no longer a valid URI. Try Uri uri = new Uri(enc) and you'll see it throws an Invalid URI exception. Found the correct solution though. – Dodgyrabbit Feb 17 '12 at 02:57
  • Interesting. I'm a bit confused by the MSDN documentation, then, which says _Encodes a URL string. The UrlEncode method can be used to encode the entire URL, including query-string values._ (see http://msdn.microsoft.com/en-us/library/system.web.httputility.urlencode.aspx) – LiquidPony Feb 17 '12 at 19:29
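
The behaviour discussed in these comments can be reproduced with a short sketch. It assumes a reference to System.Web is available; the class name WholeUrlEncodingDemo is made up, and Uri.TryCreate is used instead of the constructor so that the failure shows up as a boolean rather than an exception.

```csharp
using System;
using System.Web;

class WholeUrlEncodingDemo
{
    static void Main()
    {
        string url = "http://www.contoso.com/code/c#/somecode.cs";
        string enc = HttpUtility.UrlEncode(url);

        // The ':' and '/' separators are escaped along with the '#',
        // so the encoded string no longer has a scheme that can be parsed.
        Uri result;
        bool parsed = Uri.TryCreate(enc, UriKind.Absolute, out result);
        Console.WriteLine(parsed); // False
    }
}
```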