Suppose I have the following link tag: <a href="tel:+15555555">Phone number</a>.

How exactly does the browser know not to load the relative location ./tel:+15555555 from the current server and instead know that tel is supposed to be interpreted as a scheme?

Detecting host-relative URLs (/…) or protocol-relative URLs (//…) seems to be trivial. I guess HTTP URLs (http://… or https://…) would be simple to special-case as well. But how does the browser go about parsing a URL with an arbitrary scheme? I know a valid scheme has to start with a letter and may only contain letters, digits, or the characters +, - and . (schemes are case-insensitive and conventionally written in lowercase), which limits the scope somewhat… Of course I’m aware that the whole issue only pertains to contexts where relative URLs are valid (i.e. mostly the href and src attributes).

I’m looking for links to an RFC (e.g. one that forbids non-URL-encoded colons from being anything but scheme separators) as well as to the source code of various browsers’ URL-parsing internals.

Raphael Schweikert
  • @Pumbaa80: Colons are valid in any part of a URL. I’m looking for a spec that forbids them in relative URLs or when used unescaped: a link to `special:Random` from http://en.wikipedia.org/wiki/ECMAScript could ambiguously point to either http://en.wikipedia.org/wiki/special:Random or the `Random` URL of the `special:` scheme. I want to know WHERE EXACTLY this ambiguity is resolved (spec or source code). – Raphael Schweikert Feb 08 '13 at 08:15
  • It cannot be both an absolute URI and a relative path. Here is what [RFC3986](http://tools.ietf.org/html/rfc3986#section-4.2) says: "A path segment that contains a colon character (e.g., "this:that") cannot be used as the first segment of a relative-path reference, as it would be mistaken for a scheme name. Such a segment must be preceded by a dot-segment (e.g., "./this:that") to make a relative-path reference." – user123444555621 Feb 08 '13 at 08:33
  • @Pumbaa80 could you add this comment as an answer and I will accept it (unless someone is willing to search through WebKit and Mozilla source codes, in which case they would deserve to get the accepted answer). – Raphael Schweikert Feb 08 '13 at 08:47

2 Answers

The href value is parsed as a URI (see RFC 3986). As a result of the parsing, the browser will know that this was an absolute URI, not a relative reference.

As a matter of fact, an unescaped ":" is allowed in the path component; it just needs to occur after the first "/". Otherwise it could be parsed as the scheme delimiter if the preceding characters are all valid scheme-name characters.

See http://greenbytes.de/tech/webdav/rfc3986.html#path

The RFC also has the following to say in section 4.2 (titled “Relative Reference”): “A path segment that contains a colon character (e.g., "this:that") cannot be used as the first segment of a relative-path reference, as it would be mistaken for a scheme name. Such a segment must be preceded by a dot-segment (e.g., "./this:that") to make a relative-path reference.”
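
To make the rule concrete, here is a minimal Python sketch of the scheme check implied by the RFC 3986 grammar (`scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )`, followed by ":"). The names `SCHEME_RE` and `split_scheme` are made up for this illustration; this is not how any particular browser implements it, only a model of the decision described above.

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), then ":" (RFC 3986, section 3.1).
# Hypothetical helper names; a sketch of the rule, not browser source code.
SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")


def split_scheme(reference):
    """Return (scheme, rest) if the reference starts with a scheme, else (None, reference)."""
    match = SCHEME_RE.match(reference)
    if match is None:
        # No leading "scheme:" prefix, so this is a relative reference.
        return None, reference
    # Schemes are case-insensitive; normalize to lowercase.
    return match.group(1).lower(), reference[match.end():]


for ref in ["tel:+15555555", "./tel:+15555555", "special:Random",
            "/wiki/special:Random", "//example.com/a:b"]:
    print(ref, "->", split_scheme(ref))
```

Only `tel:+15555555` and `special:Random` yield a scheme here; in the other three, a "/" (or the leading ".") appears before the first colon, so the colon is just part of the path.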

Julian Reschke
  • So the relative reference is only tried if the parsing as absolute URI fails? – Raphael Schweikert Feb 08 '13 at 08:27
  • @Raphael Schweikert: Correct. There's a section in [the HTML5 spec](http://www.w3.org/TR/html5/infrastructure.html#parsing-urls) that basically sets in stone what most if not all browsers already do. – BoltClock Feb 08 '13 at 08:28
  • Again, what's relevant is RFC 3986. It contains an ABNF (augmented Backus-Naur Form) grammar that applies to all references, whether absolute or not. The result of parsing will tell you whether it was absolute or not. See http://greenbytes.de/tech/webdav/rfc3986.html#rfc.section.4.1 – Julian Reschke Feb 08 '13 at 09:05

See RFC 3966 for the tel URI specification, and RFC 3986 for the more generic URL specification. It's the colon (:) that separates scheme from the "hier part".
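
As an illustration of the colon acting as the separator, Python's standard `urllib.parse` module can be used to observe the split and the effect on reference resolution. This only shows one library's generic parser, under the assumption that it follows the RFC 3986 rules closely enough for these inputs; it says nothing about browser internals. The base URL is the Wikipedia example from the question's comments.

```python
from urllib.parse import urljoin, urlsplit

# The part before the first ":" is taken as the scheme, the remainder as the path.
parts = urlsplit("tel:+15555555")
print(parts.scheme, parts.path)

base = "http://en.wikipedia.org/wiki/ECMAScript"

# "special:" parses as a scheme, so the reference is treated as absolute
# and is not resolved against the base.
print(urljoin(base, "special:Random"))

# With a leading dot-segment, the colon is part of a relative path.
print(urljoin(base, "./special:Random"))
```

The last call should resolve to `http://en.wikipedia.org/wiki/special:Random`, while the second leaves `special:Random` untouched.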

CodeCaster
  • Thank you. But this does not really answer my question. I was looking for a much more generic answer: even if browser vendors special-cased `tel:` links (as they most likely do with `http(s):`), there are many other schemes which open in external applications, none of which are described in any RFC… – Raphael Schweikert Feb 08 '13 at 08:12