
When the URL http:///example.org is opened in Firefox or WebKit-based browsers, it opens http://example.org. I wonder whether this is valid behavior, i.e. whether the extra slash should be stripped and example.org treated as the authority component. I read the specification (RFC 3986) and got the impression that the authority component of such a URI should be considered empty. Some other HTTP clients, such as curl or links2, won't resolve the URL.

Is this a bug in the browsers, or a valid behavior in accordance with the RFC? Edit: Or an intended feature, in order to make browsers more user-friendly?

Community
peter
  • "Is this a bug in the browsers, or is this a valid behaviour in accordance with the RFC?" - you know it's possible to be neither.. if a user accidentally types an extra slash, I think they would rather have the browser remove it for them than have a browser that strictly enforces RFC 3986. – Blorgbeard Mar 31 '14 at 21:54
  • Maybe because it is a standard in the RFC that this behavior was implemented? I could imagine a scenario in which a user accidentally types one too many slashes, and perhaps Firefox knows that according to the RFC, only a double slash may precede an authority, and changes it accordingly. – Justin C Mar 31 '14 at 21:54
  • Yes, you are right, this is also a possibility. In that case I was worried about the security implications of this for other programs, which expect the RFC to be followed. – peter Mar 31 '14 at 21:57

1 Answer


The specification of the "http" protocol requires a hostname in the URI. See http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.2.2. So the string http:///foo is not a valid http URI, and the browser is faced with the question of what to do with the invalid URI string.
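To make the "empty authority" reading concrete, here is a minimal sketch that applies the generic URI regular expression from RFC 3986, Appendix B. The helper name `split_uri` and the dictionary keys are my own, chosen for illustration; the sketch only shows how the generic syntax splits the string, not how any particular client behaves.

```python
import re

# Generic URI regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

def split_uri(uri):
    """Split a URI reference into scheme, authority and path (hypothetical helper)."""
    m = URI_RE.match(uri)
    return {"scheme": m.group(2), "authority": m.group(4), "path": m.group(5)}

print(split_uri("http:///example.org"))
# {'scheme': 'http', 'authority': '', 'path': '/example.org'}
print(split_uri("http://example.org"))
# {'scheme': 'http', 'authority': 'example.org', 'path': ''}
```

Under the generic syntax the third slash starts the path, so the authority is the empty string, which is exactly what the "http" scheme does not allow.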

Gecko's (Firefox's) URI parser has scheme-dependent behavior: it assumes what you meant based on the URI scheme and applies certain fixups. See the comments at http://mxr.mozilla.org/mozilla-central/source/netwerk/base/public/nsIStandardURL.idl?rev=f4157e8c4107&mark=20-23,28-31,36-39#20. "http" URIs are created with the URLTYPE_AUTHORITY flag, which leads to the behavior you see (per line 31 of nsIStandardURL.idl).

Note that the current attempt to standardize how URIs should be parsed in web pages and by web browsers is at http://url.spec.whatwg.org/, and it has a whitelist of schemes at http://url.spec.whatwg.org/#relative-scheme that behave like this. If you step through the parsing algorithm for a scheme in that whitelist, once you see the ':' you enter the state at http://url.spec.whatwg.org/#authority-first-slash-state, which basically treats 0 or more slashes as all being equivalent to "//" and goes on to parse whatever follows the slashes as the "authority" section of the URL.
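The slash handling described above can be sketched roughly as follows. This is a heavily simplified illustration, not the spec's state machine: the function name `lenient_split` is hypothetical, and `RELATIVE_SCHEMES` is an illustrative set I picked rather than a copy of the spec's whitelist.

```python
import re

# Illustrative set only (an assumption); consult the spec for the real whitelist.
RELATIVE_SCHEMES = {"http", "https", "ftp", "ws", "wss"}

def lenient_split(url):
    """Return (scheme, authority, rest), ignoring how many slashes follow the colon."""
    scheme, sep, remainder = url.partition(":")
    scheme = scheme.lower()
    if not sep or scheme not in RELATIVE_SCHEMES:
        raise ValueError("no scheme from the illustrative list")
    # Zero or more slashes after the colon are all treated like "//".
    remainder = remainder.lstrip("/")
    # The authority runs up to the next "/", "?" or "#".
    authority = re.match(r"[^/?#]*", remainder).group(0)
    return scheme, authority, remainder[len(authority):]

print(lenient_split("http:///example.org"))
print(lenient_split("http:example.org/a"))
```

With this kind of lenient parsing, `http:///example.org`, `http://example.org` and `http:example.org` all end up with `example.org` as the authority, which matches what the browsers in the question do.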

Boris Zbarsky