Take this example:

var client = new HttpClient();
await client.GetAsync("http://www.google.com?q=%2D");

This actually sends a request to 'http://www.google.com?q=-'. I don't want .NET to alter my URL.

This behavior comes from System.Uri, which seems to unescape such characters.

How can I prevent Uri/HttpClient from changing my url?

.NET Framework 4.7.2

Update: this behavior seems to be by design. I still can't believe there is no way around this. What if I actually want to send '?q=what does %2D mean' to google.com? It now gets sent as 'http://www.google.com/?q=what%20does%20-%20mean', which is NOT what I meant to do.

Remco Ros
  • Out of curiosity, and possibly relevant to the overall question, where specifically are you observing the altered URL and what problem is it causing? – David Oct 28 '19 at 13:26
  • Also, what version of .NET are you using? Some .NET Framework, .NET Core, Mono or some other implementation? – fredrik Oct 28 '19 at 13:28
  • I'm observing this using Fiddler when working directly with Uri or indirectly through HttpClient. I need to send %2D to an external API (out of our control) – Remco Ros Oct 28 '19 at 13:30
  • How have you determined that it is HttpClient or Uri that are misbehaving? Could it be Fiddler? – fredrik Oct 28 '19 at 13:33
  • @RemcoRos: It *really shouldn't* be causing a problem, but 3rd party APIs aren't always the friendliest. As a possible workaround, what if you send the value `%252D` (which is the URL-encoded `%2D`, basically double-encoding before sending)? – David Oct 28 '19 at 13:34
  • @david Already tried that :-) it gets sent as ?q=%252D (weird that here %25 is not unescaped to %...) – Remco Ros Oct 28 '19 at 13:36
  • % is not an escaped character. Please read [this](https://learn.microsoft.com/en-us/dotnet/api/system.uri?view=netframework-4.8) article, specifically section 'remarks'. Among other things it says that you can use application config file to configure the Uri class behaviour – Fabjan Oct 28 '19 at 13:50
  • I tried: new Uri("file:///root/%2D/foo"). This gets translated to file:///root/-/foo. How does that even work, given that it is actually a different directory? I want Uri to NOT touch my URL, or to be able to send to a raw URL using HttpClient. But since HttpClient uses Uri everywhere, I cannot get around this. I can't believe there is no good way to use "%2D" as a literal value in a URL. – Remco Ros Oct 28 '19 at 13:54
  • So according to https://learn.microsoft.com/en-us/dotnet/api/system.uri?view=netframework-4.8 this behavior is by design. Any workarounds? I NEED to encode hyphens... (don't ask, I have no control over the external API) – Remco Ros Oct 28 '19 at 14:00
  • If it's by design and there are no options available to override it, afaik your only option is to use another library for the request. – fredrik Oct 28 '19 at 14:03
  • Any suggestions for a library? the old WebClient also uses Uri's and I can imagine any other library will as well. – Remco Ros Oct 28 '19 at 14:38

1 Answer

Possible partial solution based on reflection.

I think the problem is that the hyphen (-) is listed as a special character here: https://referencesource.microsoft.com/#System/net/System/UriHelper.cs,657 . I don't think there's a way to modify the http scheme to change that behavior.
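
To see where the rewrite happens before reaching for reflection, it can help to compare what Uri was given with what it reports after parsing. The following is only a minimal sketch: `UriEscapeDemo` and `Describe` are hypothetical names introduced for illustration, while `OriginalString`, `AbsoluteUri` and `Query` are existing public members of System.Uri. `OriginalString` returns the string exactly as it was passed to the constructor; the other two lines show the parsed form, which on .NET Framework 4.7.2 should match the altered URL described in the question.

```csharp
using System;

static class UriEscapeDemo
{
    // Hypothetical helper: reports the raw input next to the values
    // that Uri exposes after it has parsed (and possibly rewritten) the URL.
    public static string Describe(string rawUrl)
    {
        var uri = new Uri(rawUrl);
        return "OriginalString: " + uri.OriginalString + Environment.NewLine
             + "AbsoluteUri:    " + uri.AbsoluteUri + Environment.NewLine
             + "Query:          " + uri.Query;
    }
}
```

Calling `Console.WriteLine(UriEscapeDemo.Describe("http://www.google.com?q=%2D"));` prints the three values one per line, which makes it easy to check the behavior on a given runtime without a proxy such as Fiddler.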

There was a previous bug, which has since been fixed, relating to how file paths are parsed by Uri. At the time, the workaround was to change the private flags of the related UriParser using reflection: https://stackoverflow.com/a/2285321/1462295

Here is a quick demo; you'll have to evaluate whether it helps in your case. It depends on which member the consuming code calls: if it calls uri.ToString(), this might help; if it calls uri.GetComponents, you'll have to figure something else out. The code reaches into the Uri object and replaces the parsed string with something else. Here's the code and console output:

using System;
using System.Reflection;

class Program
{
    static void Main(string[] args)
    {
        var surl = "http://www.google.com?q=%2D";

        var url = new Uri(surl);
        Console.WriteLine("Broken: " + url.ToString());

        // Declared in Uri class as
        //     private UriInfo     m_Info;
        // https://referencesource.microsoft.com/#System/net/System/URI.cs,129
        // This is a private implementation detail of .NET Framework.
        FieldInfo infoField = typeof(Uri).GetField("m_Info", BindingFlags.Instance | BindingFlags.NonPublic);

        // Immediately after m_Info is declared, the private class definition is given:
        //     private class UriInfo {
        //         public string   String;
        //         ...
        //     };
        object info = infoField.GetValue(url);
        FieldInfo infoStringField = info.GetType().GetField("String");

        // If you check the value of m_Info.String, you'll see it has the
        // modified string with '?q=-'.
        // The idea with this block of code is to replace the parsed
        // string with the one that you want.
        // This just replaces the string with the original value.
        infoStringField.SetValue(info, surl);

        // ToString() @ https://referencesource.microsoft.com/#System/net/System/URI.cs,1661
        // There are a couple of 'if' branches, but the last line is
        //     return m_Info.String;
        // This is the idea behind the above code.
        Console.WriteLine("Fixed: " + url.ToString());

        // However, GetComponents uses entirely different logic:
        Console.WriteLine($"Still broken: {url.GetComponents(UriComponents.AbsoluteUri, UriFormat.Unescaped)}");
        Console.WriteLine($"Still broken: {url.GetComponents(UriComponents.AbsoluteUri, UriFormat.SafeUnescaped)}");
        Console.WriteLine($"Still broken: {url.GetComponents(UriComponents.AbsoluteUri, UriFormat.UriEscaped)}");

        Console.WriteLine("Press ENTER to exit ...");
        Console.ReadLine();
    }
}

Console output:

Broken: http://www.google.com/?q=-
Fixed: http://www.google.com?q=%2D
Still broken: http://www.google.com/?q=-
Still broken: http://www.google.com/?q=-
Still broken: http://www.google.com/?q=-
Press ENTER to exit ...

You might find some other inspiration from the code here which does use reflection, but also defines its own scheme to work with. Note the trust issues mentioned.

You mention .NET Framework 4.7.2, where the above code should work. It will not work on .NET Core.
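
Because the demo relies on private field names, a guarded variant that fails soft may be safer if the code could ever run on a different runtime. This is only a sketch of the same reflection idea, not a supported API: `UriStringPatch` and `TryOverrideToString` are hypothetical names, and the field names `m_Info` and `String` are the .NET Framework internals referenced above, assumed rather than guaranteed. The helper returns false instead of throwing when those fields are not found, and, like the demo, it only affects ToString(), not GetComponents.

```csharp
using System;
using System.Reflection;

static class UriStringPatch
{
    // Hypothetical helper: tries to make uri.ToString() return 'raw'.
    // Returns false when the expected private fields are not present.
    public static bool TryOverrideToString(Uri uri, string raw)
    {
        if (uri == null) throw new ArgumentNullException(nameof(uri));

        // Call ToString() once so the Uri has parsed itself; the demo above
        // does the same before touching the private fields.
        uri.ToString();

        // Assumption: .NET Framework keeps the parsed data in a private field named m_Info.
        FieldInfo infoField = typeof(Uri).GetField("m_Info", BindingFlags.Instance | BindingFlags.NonPublic);
        if (infoField == null) return false;

        object info = infoField.GetValue(uri);
        if (info == null) return false;

        // Assumption: the info object exposes a public string field named String.
        FieldInfo stringField = info.GetType().GetField("String");
        if (stringField == null || stringField.FieldType != typeof(string)) return false;

        stringField.SetValue(info, raw);
        return true;
    }
}
```

A caller could use it as `if (!UriStringPatch.TryOverrideToString(url, surl)) { /* fall back or log */ }`, so that a runtime without these internals degrades to the default behavior instead of throwing a NullReferenceException.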

BurnsBA
  • Thanks for the input, this seems to work when using Url.ToString(). Not a complete solution unfortunately, because under the hood of HttpClient GetComponents or AbsoluteUri property is used (url.ToString() is not recommended to be used). Accepting this as an answer as it's closest to a possible solution. Thanks! – Remco Ros Nov 08 '19 at 09:25