18

When the .NET System.Uri class parses a string, it normalizes the input, for example by lower-casing the scheme and hostname. It also trims trailing periods from each path segment. That second behavior is fatal to OpenID applications, because some OpenIDs (such as those issued by Yahoo) include base64-encoded path segments that may end with a period.
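To see the normalization, it can help to print the original input next to what Uri reports. The following is a minimal sketch; the class and method names (`TrailingDotProbe`, `PathSurvives`, `Report`) are made up for illustration, and whether the trailing periods survive depends on the .NET version in use.

```csharp
using System;

public static class TrailingDotProbe
{
    // Returns true when Uri reports the path exactly as it was written in the input.
    public static bool PathSurvives(string url, string expectedPath)
    {
        var uri = new Uri(url);
        return string.Equals(uri.AbsolutePath, expectedPath, StringComparison.Ordinal);
    }

    // Prints the original input next to the values Uri reports after parsing.
    public static void Report(string url)
    {
        var uri = new Uri(url);
        Console.WriteLine("input:          " + url);
        Console.WriteLine("OriginalString: " + uri.OriginalString);
        Console.WriteLine("AbsolutePath:   " + uri.AbsolutePath);
        Console.WriteLine("AbsoluteUri:    " + uri.AbsoluteUri);
    }
}
```

Calling `TrailingDotProbe.Report("http://me.yahoo.com/b./c.")` on the affected runtime should show whether `/b./c.` comes back intact in `AbsolutePath`.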

How can I disable this period-trimming behavior of the Uri class?

Registering my own scheme using UriParser.Register with a parser initialized with GenericUriParserOptions.DontCompressPath avoids the period trimming, and some other operations that are also undesirable for OpenID. But I cannot register a new parser for existing schemes like HTTP and HTTPS, which I must do for OpenIDs.
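The limitation described above can be sketched as follows. `ParserRegistration` and `TryRegister` are hypothetical names; the sketch only relies on `UriParser.IsKnownScheme`, `UriParser.Register` and `GenericUriParser`, and it assumes that `Register` throws `InvalidOperationException` for a scheme that already has a parser.

```csharp
using System;

public static class ParserRegistration
{
    // Tries to register a DontCompressPath parser for the given scheme.
    // Returns false when the scheme already has a parser (for example "http").
    public static bool TryRegister(string scheme, int defaultPort)
    {
        if (UriParser.IsKnownScheme(scheme))
        {
            return false;
        }

        try
        {
            UriParser.Register(
                new GenericUriParser(GenericUriParserOptions.DontCompressPath),
                scheme,
                defaultPort);
            return true;
        }
        catch (InvalidOperationException)
        {
            // Register rejects a scheme that already has a parser.
            return false;
        }
    }
}
```

With this helper, `TryRegister("http", 80)` is expected to return false, while a new scheme name can be registered once per process.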

Another approach I tried was registering my own new scheme, and programming the custom parser to change the scheme back to the standard HTTP(s) schemes as part of parsing:

using System;
using System.Net;

public class MyUriParser : GenericUriParser
{
    private string actualScheme;

    public MyUriParser(string actualScheme)
        : base(GenericUriParserOptions.DontCompressPath)
    {
        this.actualScheme = actualScheme.ToLowerInvariant();
    }

    protected override string GetComponents(Uri uri, UriComponents components, UriFormat format)
    {
        string result = base.GetComponents(uri, components, format);

        // Substitute our actual desired scheme in the string if it's in there.
        if ((components & UriComponents.Scheme) != 0)
        {
            string registeredScheme = base.GetComponents(uri, UriComponents.Scheme, format);
            result = this.actualScheme + result.Substring(registeredScheme.Length);
        }

        return result;
    }
}

class Program
{
    static void Main(string[] args)
    {
        UriParser.Register(new MyUriParser("http"), "httpx", 80);
        UriParser.Register(new MyUriParser("https"), "httpsx", 443);
        Uri z = new Uri("httpsx://me.yahoo.com/b./c.#adf");
        var req = (HttpWebRequest)WebRequest.Create(z);
        req.GetResponse();
    }
}

This almost works. The Uri instance reports https instead of httpsx everywhere except in the Uri.Scheme property itself. That's a problem when you pass this Uri instance to HttpWebRequest to send a request to this address. Apparently it checks the Scheme property and doesn't recognize it as 'https', because it sends plaintext to port 443 instead of negotiating SSL.

I'm happy for any solution that:

  1. Preserves trailing periods in path segments in Uri.Path
  2. Includes these periods in outgoing HTTP requests.
  3. Ideally works under ASP.NET medium trust (but this is not absolutely necessary).
Andrew Arnott
  • Would be easier if your sample code was a failing unit test to illustrate what the problem is. – Simon Mar 25 '10 at 21:33
  • The unit test would have to set up an HTTPS web server to prove the failure. :( – Andrew Arnott Mar 26 '10 at 00:09
  • Did you ever get this resolved successfully? Do you still need help with this? – jcolebrand Dec 14 '10 at 04:24
  • Yes, through a complex mixture of reflection (when full trust is available) and other magic when full trust is not available. Much of the code in https://github.com/AArnott/dotnetopenid/blob/v3.4/src/DotNetOpenAuth/OpenId/UriIdentifier.cs is dedicated to playing all the tricks necessary to get 99% of the right behavior. – Andrew Arnott Dec 14 '10 at 07:14

4 Answers

5

Microsoft says it will be fixed in .NET 4.0 (though it appears from the comments that it has not been fixed yet):

https://connect.microsoft.com/VisualStudio/feedback/details/386695/system-uri-incorrectly-strips-trailing-dots?wa=wsignin1.0#tabs

There is a workaround on that page, however. It involves using reflection to change the options though, so it may not meet the medium trust requirement. Just scroll to the bottom and click on the "Workarounds" tab.
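A reflection-based workaround of this kind could look roughly like the sketch below. It is not a copy of the code on the Connect page: the private member names `GetSyntax` and `m_Flags` are assumptions about .NET Framework internals, the flag value is taken from the `UriSyntaxFlags.CanonicalizeAsFilePath` listing elsewhere on this page, and `UriDotWorkaround` is a made-up name. If the private members are not found, the method does nothing.

```csharp
using System;
using System.Reflection;

public static class UriDotWorkaround
{
    // Value of UriSyntaxFlags.CanonicalizeAsFilePath as listed for the .NET Framework.
    private const int CanonicalizeAsFilePath = 0x1000000;

    // Clears CanonicalizeAsFilePath on the built-in http and https parsers.
    // Returns how many parsers were changed; 0 means the private members
    // were not found or the flag was already clear.
    public static int Apply()
    {
        const BindingFlags nonPublicStatic = BindingFlags.Static | BindingFlags.NonPublic;
        const BindingFlags nonPublicInstance = BindingFlags.Instance | BindingFlags.NonPublic;

        // Assumed private member names; they are not part of the public API.
        MethodInfo getSyntax = typeof(UriParser).GetMethod("GetSyntax", nonPublicStatic);
        FieldInfo flagsField = typeof(UriParser).GetField("m_Flags", nonPublicInstance);
        if (getSyntax == null || flagsField == null)
        {
            return 0;
        }

        int changed = 0;
        foreach (string scheme in new[] { "http", "https" })
        {
            var parser = getSyntax.Invoke(null, new object[] { scheme }) as UriParser;
            if (parser == null)
            {
                continue;
            }

            int flags = Convert.ToInt32(flagsField.GetValue(parser));
            if ((flags & CanonicalizeAsFilePath) != 0)
            {
                // Write the value back as the field's own enum type.
                object updated = Enum.ToObject(flagsField.FieldType, flags & ~CanonicalizeAsFilePath);
                flagsField.SetValue(parser, updated);
                changed++;
            }
        }

        return changed;
    }
}
```

Something like `UriDotWorkaround.Apply()` would be called once at application startup. Because it touches non-public members, it needs reflection permission and is unlikely to run under medium trust.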

Thanks to jxdavis and Google for this answer:

http://social.msdn.microsoft.com/Forums/en-US/netfxbcl/thread/5206beca-071f-485d-a2bd-657d635239c9

Jeff Atwood
Maxx Daymon
  • The MS Connect bug is out of date, unfortunately. The .NET team has told me directly that .NET 4.0 does not fix the dot bug. But the workaround is interesting. Thanks. – Andrew Arnott Mar 26 '10 at 00:16
  • Dead link (connect.microsoft.com). Second link points to https://stackoverflow.com/a/2285321/1462295 at the bottom of the page. – BurnsBA Oct 28 '19 at 16:41
2

I'm curious if part of the problem is that you are only accounting for "don't compress path", instead of all the defaults of the base HTTP parser: (including UnEscapeDotsAndSlashes)

  private const UriSyntaxFlags HttpSyntaxFlags = (UriSyntaxFlags.AllowIriParsing | UriSyntaxFlags.AllowIdn | UriSyntaxFlags.UnEscapeDotsAndSlashes | UriSyntaxFlags.CanonicalizeAsFilePath | UriSyntaxFlags.CompressPath | UriSyntaxFlags.ConvertPathSlashes | UriSyntaxFlags.PathIsRooted | UriSyntaxFlags.AllowAnInternetHost | UriSyntaxFlags.AllowUncHost | UriSyntaxFlags.MayHaveFragment | UriSyntaxFlags.MayHaveQuery | UriSyntaxFlags.MayHavePath | UriSyntaxFlags.MayHavePort | UriSyntaxFlags.MayHaveUserInfo | UriSyntaxFlags.MustHaveAuthority);

That's as opposed to the news scheme, which has these flags (for instance):

 private const UriSyntaxFlags NewsSyntaxFlags = (UriSyntaxFlags.AllowIriParsing | UriSyntaxFlags.MayHaveFragment | UriSyntaxFlags.MayHavePath);

Dang, Brandon Black beat me to it while I was working on typing things up...

This may help with code readability:

namespace System 
{
    [Flags]
    internal enum UriSyntaxFlags
    {
        AllowAnInternetHost = 0xe00,
        AllowAnyOtherHost = 0x1000,
        AllowDnsHost = 0x200,
        AllowDOSPath = 0x100000,
        AllowEmptyHost = 0x80,
        AllowIdn = 0x4000000,
        AllowIPv4Host = 0x400,
        AllowIPv6Host = 0x800,
        AllowIriParsing = 0x10000000,
        AllowUncHost = 0x100,
        BuiltInSyntax = 0x40000,
        CanonicalizeAsFilePath = 0x1000000,
        CompressPath = 0x800000,
        ConvertPathSlashes = 0x400000,
        FileLikeUri = 0x2000,
        MailToLikeUri = 0x4000,
        MayHaveFragment = 0x40,
        MayHavePath = 0x10,
        MayHavePort = 8,
        MayHaveQuery = 0x20,
        MayHaveUserInfo = 4,
        MustHaveAuthority = 1,
        OptionalAuthority = 2,
        ParserSchemeOnly = 0x80000,
        PathIsRooted = 0x200000,
        SimpleUserSyntax = 0x20000,
        UnEscapeDotsAndSlashes = 0x2000000,
        V1_UnknownUri = 0x10000
    }
}
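Because `UriSyntaxFlags` is internal, one way to reason about these values is to copy the relevant constants and do the bit arithmetic by hand. The sketch below uses the values from the listing above; `FlagMath`, `Has` and `Clear` are illustrative names, not part of .NET.

```csharp
using System;

public static class FlagMath
{
    // Values copied from the UriSyntaxFlags listing.
    public const int MayHavePath = 0x10;
    public const int MayHaveFragment = 0x40;
    public const int CompressPath = 0x800000;
    public const int CanonicalizeAsFilePath = 0x1000000;
    public const int UnEscapeDotsAndSlashes = 0x2000000;
    public const int AllowIriParsing = 0x10000000;

    // The news flags quoted in the answer: no path compression or canonicalization.
    public const int NewsSyntaxFlags = AllowIriParsing | MayHaveFragment | MayHavePath;

    // True when every bit of 'flag' is set in 'flags'.
    public static bool Has(int flags, int flag)
    {
        return (flags & flag) == flag;
    }

    // Returns 'flags' with the bits of 'flag' cleared.
    public static int Clear(int flags, int flag)
    {
        return flags & ~flag;
    }
}
```

For example, `FlagMath.Has(FlagMath.NewsSyntaxFlags, FlagMath.CompressPath)` is false, which matches the observation that the news parser does not compress paths.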
jcolebrand
1

You should be able to percent-escape the '.' as '%2E', but that's the cheap and dirty way out.

You might try playing around with the dontEscape option a bit and it may change how Uri is treating those characters.

More info here: http://msdn.microsoft.com/en-us/library/system.uri.aspx

Also check out the following (see DontUnescapePathDotsAndSlashes): http://msdn.microsoft.com/en-us/library/system.genericuriparseroptions.aspx
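The percent-escaping idea could be sketched as a small string helper that rewrites only a period at the very end of a path segment. `DotEscaper` and `EscapeTrailingDots` are hypothetical names, and the sketch says nothing about whether Uri later unescapes `%2E` again; that depends on the parser's flags (see `DontUnescapePathDotsAndSlashes`).

```csharp
using System;

public static class DotEscaper
{
    // Replaces a single trailing period in each path segment with %2E.
    // The special segments "." and ".." are left alone because they carry meaning.
    public static string EscapeTrailingDots(string path)
    {
        string[] segments = path.Split('/');
        for (int i = 0; i < segments.Length; i++)
        {
            string segment = segments[i];
            bool isDotSegment = segment == "." || segment == "..";
            if (!isDotSegment && segment.EndsWith(".", StringComparison.Ordinal))
            {
                segments[i] = segment.Substring(0, segment.Length - 1) + "%2E";
            }
        }

        return string.Join("/", segments);
    }
}
```

For the path from the question, `DotEscaper.EscapeTrailingDots("/b./c.")` yields `/b%2E/c%2E`. Only the last period of a segment is escaped, so a segment ending in several periods would need extra handling.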

Brandon Black
  • Thanks, Brandon. The `DontUnescapePathDotsAndSlashes` option is one possible workaround, although to work effectively it needs to be applied to the existing HTTP and HTTPS parsers, which is only possible in .NET 4.0 (unless you use reflection as has been suggested in other answers here). – Andrew Arnott Mar 26 '10 at 00:18
1

Does this work?

using System;
using System.Reflection;

public class MyUriParser : UriParser
{
    private string actualScheme;

    public MyUriParser(string actualScheme)
    {
        // Set the private m_Flags field of the UriParser base class via reflection.
        Type type = this.GetType();
        FieldInfo fInfo = type.BaseType.GetField("m_Flags", BindingFlags.Instance | BindingFlags.NonPublic);
        fInfo.SetValue(this, GenericUriParserOptions.DontCompressPath);
        this.actualScheme = actualScheme.ToLowerInvariant();
    }

    protected override string GetComponents(Uri uri, UriComponents components, UriFormat format)
    {
        string result = base.GetComponents(uri, components, format);

        // Substitute our actual desired scheme in the string if it's in there.
        if ((components & UriComponents.Scheme) != 0)
        {
            string registeredScheme = base.GetComponents(uri, UriComponents.Scheme, format);
            result = this.actualScheme + result.Substring(registeredScheme.Length);
        }

        return result;
    }
}
Raj Kaimal