I have a URL with several query parameters, one of which contains Cyrillic letters.

http://localhost/Print.aspx?id=4&subwebid=243572&docnumber=%u0417%u041f005637-1&deliverypoint=4630013519990

The docnumber value should decode to ЗП005637-1. I tried the following code, but the resulting string still contains the %u0417%u041f sequences.

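public static String DecodeUrlString(this String url)
{
    // Unescape repeatedly until the string stops changing,
    // so that double-encoded input is fully decoded.
    String newUrl;
    while ((newUrl = Uri.UnescapeDataString(url)) != url)
        url = newUrl;
    return newUrl;
}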

Using HttpUtility is not an option.

Wandering Fool
pad0n
  • So use the code *from* HttpUtility: http://referencesource.microsoft.com/#System.Web/Util/HttpEncoder.cs,afac0d4e31f5382a – Alex K. Aug 10 '15 at 15:34
  • possible duplicate of [Alternative to HttpUtility.ParseQueryString without System.Web dependency?](http://stackoverflow.com/questions/27442985/alternative-to-httputility-parsequerystring-without-system-web-dependency) – Tim Rogers Aug 10 '15 at 15:40

1 Answer

If your goal is to avoid a dependency on System.Web.dll, you would normally use the equivalent method in the WebUtility class, WebUtility.UrlDecode.

However, you will find that even then your URL won't be decoded the way you want.

This is because WebUtility.UrlDecode deliberately does not handle the %uNNNN escape notation. Notice this comment in the source code:

// *** Source: alm/tfs_core/Framework/Common/UriUtility/HttpUtility.cs
// This specific code was copied from above ASP.NET codebase.
// Changes done - Removed the logic to handle %Uxxxx as it is not standards compliant.

As stated in the comment, the %uNNNN escape format is not standards-compliant and should be avoided if possible. You can find more information on this, and on the proper way of encoding URLs, in this thread.
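For comparison, here is a minimal sketch of what the standards-compliant form looks like: each character is percent-encoded as its UTF-8 bytes. The class and method names (StandardEncodingDemo, Encode, Decode) are made up for illustration; only Uri.EscapeDataString and Uri.UnescapeDataString are real framework calls.

```csharp
using System;

public static class StandardEncodingDemo
{
    // Percent-encodes a query-string value as UTF-8 bytes (%XX per byte).
    public static string Encode(string value)
    {
        return Uri.EscapeDataString(value);
    }

    // Reverses the encoding above; no System.Web dependency is needed.
    public static string Decode(string value)
    {
        return Uri.UnescapeDataString(value);
    }
}
```

With this, StandardEncodingDemo.Encode("ЗП005637-1") gives "%D0%97%D0%9F005637-1", and Decode turns that back into the original string. A URL built this way would work with the decoding loop from the question.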

If you have any control over how the URL is generated, consider changing it to be standards-compliant. Otherwise, consider adding System.Web.dll as a dependency, finding a third-party library that does the job, or writing your own decoder. As already noted in the comments, the source code is available.
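If you do end up writing your own decoder, the following is one possible sketch, not the HttpUtility implementation. It assumes the input only uses %uNNNN for non-ASCII characters and ordinary %XX escapes otherwise. PercentUDecoder and DecodeLegacyUrlString are hypothetical names chosen for this example.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class PercentUDecoder
{
    // Matches the non-standard %uNNNN form: 'u' in either case, four hex digits.
    private static readonly Regex PercentU =
        new Regex("%[uU]([0-9A-Fa-f]{4})", RegexOptions.CultureInvariant);

    // Hypothetical helper: first turns each %uNNNN into its UTF-16 code unit,
    // then lets Uri.UnescapeDataString handle the remaining %XX escapes.
    public static string DecodeLegacyUrlString(string value)
    {
        if (value == null) throw new ArgumentNullException("value");

        string withoutPercentU = PercentU.Replace(value, delegate(Match m)
        {
            ushort codeUnit = ushort.Parse(
                m.Groups[1].Value, NumberStyles.HexNumber, CultureInfo.InvariantCulture);
            return ((char)codeUnit).ToString();
        });

        return Uri.UnescapeDataString(withoutPercentU);
    }
}
```

Calling PercentUDecoder.DecodeLegacyUrlString("%u0417%u041f005637-1") returns "ЗП005637-1". The sketch has known gaps: it does not convert "+" to a space, an input such as %u0025 followed by two hex digits would be unescaped a second time, and any %80 to %FF escapes are read as UTF-8 bytes, which may not be what the producer of the URL intended.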

Community
sstan
  • +1 for the last paragraph. `%uNNNN` is usually a sign that something on the client side is using the deprecated JavaScript `escape()` function. This doesn't produce valid URLs: not only does the `%u` escape come out for Unicode characters, but the `%` escapes for 0x80–0xFF are wrong too. So by far the best fix is to stop using `escape()` and switch to `encodeURIComponent()` instead. – bobince Aug 11 '15 at 10:14