16

I noticed that both Chrome and Firefox ignore slashes between words in a URL.

So, github.com/octocat/hello-world seems to be equivalent to github.com//////octocat////hello-world.

I am writing an application that parses a URL and retrieves a part of it, and thanks to this behavior, I am able to return the original URL without modifying the code, which in my case is rather convenient. I don't know if it would be a good idea to rely on this quirk though.

Jorge Bucaran
  • 7
    You can always count on Wikipedia to have terrible URLs: [/](http://en.wikipedia.org/wiki//) [//](http://en.wikipedia.org/wiki///) [///](http://en.wikipedia.org/wiki////) – Kobi Dec 21 '14 at 13:51
  • 2
    "and thanks to this behavior, I am able to return the original URL without modifying the code," it's precisely because the browsers **don't** treat the URIs as equivalent that you can return the original URI. If browsers were treating different URIs as the same, how could you ever know what the original URI was? – Jon Hanna Dec 21 '14 at 16:29

4 Answers

19

Path separators are defined to be a single slash according to this. (Search for Path Component)

Note that browsers don't usually modify the URL. Browsers could append a / at the end of a URL, but in your case, the URL with extra slashes is simply sent along in the request, so it is the server ignoring the slashes instead.
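
As a rough illustration of that point, here is a minimal Python sketch. It uses the standard `urllib.parse` module as a stand-in for a client (it is not what a browser runs internally) and shows that parsing the URL leaves the repeated slashes in the path untouched, so they would end up in the request line as-is. The `request_line` variable is only an illustrative name.

```python
from urllib.parse import urlsplit

url = "https://github.com//////octocat////hello-world"
parts = urlsplit(url)

# The host is parsed as usual; the path keeps every slash.
print(parts.netloc)  # github.com
print(parts.path)    # //////octocat////hello-world

# Illustrative request line a client could send for this URL.
request_line = f"GET {parts.path} HTTP/1.1"
print(request_line)
```

Whether `//////octocat////hello-world` and `/octocat/hello-world` end up at the same resource is then entirely up to the server.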

Also, have a look at:

Even if this behavior is convenient for you, it is generally not recommended. In addition, caching may also be affected (source):

Since both your browser and the server cache individual pages (according to their caching settings), requesting same file multiple times via slightly different URIs might affect the caching (depending on server and client implementation).
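
To make the caching concern concrete, here is a small sketch of a deliberately naive cache that uses the exact URL string as its key. The names `cache`, `load`, and `fetch` are hypothetical and `load` only pretends to perform a request. Real browser and server caches are more elaborate and may normalize keys differently.

```python
# Hypothetical naive cache keyed by the exact URL string (no normalization).
cache = {}
loads = []  # records every URL that actually had to be loaded


def load(url):
    # Stand-in for a real network request; assumed for illustration only.
    loads.append(url)
    return f"content for {url}"


def fetch(url):
    # Return the cached body, loading it only on the first request for this key.
    if url not in cache:
        cache[url] = load(url)
    return cache[url]


fetch("https://github.com/octocat/hello-world")
fetch("https://github.com//////octocat////hello-world")
fetch("https://github.com/octocat/hello-world")  # served from the cache

print(len(loads))  # 2: the two spellings are stored as separate entries
```

Under this assumption the same page is loaded and stored twice, once per spelling of the URL.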

Jorge Bucaran
  • `Browsers could append a / at the end of a URL` as far as I can tell, they don't do that. The only URL manipulations that browsers do are resolving relative URLs and, when a URL is typed directly in the address bar, possibly fixing common typos. – Lie Ryan Dec 21 '14 at 23:51
  • 2
    A `/` is necessary to make a valid HTTP request. See [here](http://webmasters.stackexchange.com/questions/35643/is-trailing-slash-automagically-added-on-click-of-home-page-url-in-browser) for the long story. – Jorge Bucaran Dec 22 '14 at 00:06
  • Also found this article [`To slash or not to slash`](http://googlewebmastercentral.blogspot.jp/2010/04/to-slash-or-not-to-slash.html). – Jorge Bucaran Dec 22 '14 at 00:12
9

An empty path segment is valid as per specification:

path          = path-abempty    ; begins with "/" or is empty
              / path-absolute   ; begins with "/" but not "//"
              / path-noscheme   ; begins with a non-colon segment
              / path-rootless   ; begins with a segment
              / path-empty      ; zero characters

path-abempty  = *( "/" segment )
path-absolute = "/" [ segment-nz *( "/" segment ) ]
path-noscheme = segment-nz-nc *( "/" segment )
path-rootless = segment-nz *( "/" segment )
path-empty    = 0<pchar>

segment       = *pchar
segment-nz    = 1*pchar
segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
              ; non-zero-length segment without any colon ":"

pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

In the latter URI https://github.com//////octocat////hello-world, the path //////octocat////hello-world would be composed of:

  • //////octocat////hello-world: path-abempty
  • /: segment
  • /: segment
  • /: segment
  • /: segment
  • /: segment
  • /octocat: segment-nz
  • /: segment
  • /: segment
  • /: segment
  • /hello-world: segment-nz

Removing these empty path segments would produce a different URI. How the server handles these empty path segments is a separate question.
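
The breakdown above can be reproduced with a short Python sketch. The helper `path_segments` is a hypothetical name, and splitting on `/` is only an approximation of the `path-abempty` rule (it does not validate `pchar`), but it shows the empty segments explicitly.

```python
from urllib.parse import urlsplit


def path_segments(url):
    """Split the path of a URL into its segments, keeping empty ones."""
    path = urlsplit(url).path
    if not path:
        return []
    # path-abempty = *( "/" segment ): each "/" starts a new segment,
    # so drop the empty string that precedes the first "/".
    return path.split("/")[1:]


doubled = path_segments("https://github.com//////octocat////hello-world")
plain = path_segments("https://github.com/octocat/hello-world")

print(doubled)  # ['', '', '', '', '', 'octocat', '', '', '', 'hello-world']
print(plain)    # ['octocat', 'hello-world']
```

The two lists differ (ten segments versus two), which is why the two URIs are not the same identifier even when a server chooses to answer both with the same page.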

Gumbo
  • Thank you! This answer is a bit too advanced for me though. I did understand the URL is handled by servers, not browsers though. –  Dec 21 '14 at 10:50
  • 1
    @Onarotrom It is handled by browsers as well. But they won’t change anything that changes the semantic intention of the URL. And removing empty path segments would change the semantic intention. – Gumbo Dec 21 '14 at 10:57
  • 1
    @Onarotrom For example, if you enter `https://github.com` into the browser’s address bar, the browser will append `/` to it: `https://github.com/`. Although it changes the semantics, you need a path for HTTP. – Gumbo Dec 21 '14 at 11:00
  • Your last comment is incorrect: an empty path component is equivalent to a `/` path component per this [RFC 3986 paragraph](https://tools.ietf.org/html/rfc3986#section-6.2.3): "For example, because the "http" scheme makes use of an authority component, has a default port of "80", and defines an empty path to be equivalent to "/", the following four URIs are equivalent: `http://example.com`, `http://example.com/`, `http://example.com:/`, `http://example.com:80/` In general, a URI that uses the generic syntax for authority with an empty path should be normalized to a path of "/"." – Géry Ogam Aug 29 '19 at 14:02
6

Actually, browsers do not ignore them; they pass them to the web server in the HTTP request. It's the server that may decide to ignore them, but technically, repeating slashes results in a different URL.

W3.org specifies that the path part of a URL consists of "path segments", separated by /, and a path segment consists of zero or more "URL units" (characters) except / and ?, so empty path segments are allowed, which is what you get when you duplicate slashes.

See http://www.w3.org/TR/url-1/ for details

Wormbo
3

Actually, browsers do not ignore repeated slashes in URLs.

If you use document.URL in (client side) JavaScript you get the URL with the repeating '///'s.

Similarly in (server side) PHP, when using $_SERVER['REQUEST_URI'] you get the URL with the repeating '///'s.

It is the server, e.g., Apache, that actually maps the request to the proper page without changing the URL. In Apache you can write rules in the .htaccess file so that it does not serve the page when the extra slashes would otherwise be ignored.
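
For illustration only, the kind of server-side handling described here could look like the following Python sketch. `collapse_slashes` is a hypothetical helper, not Apache's actual implementation. It merges runs of slashes before the path is matched against a resource, while the client still sees the URL it originally requested.

```python
import re


def collapse_slashes(path):
    """Merge every run of two or more slashes in a request path into one."""
    return re.sub(r"/{2,}", "/", path)


requested = "//////octocat////hello-world"
resolved = collapse_slashes(requested)

print(requested)  # what the client sent, unchanged
print(resolved)   # /octocat/hello-world
```

A server that skips this step would treat the two paths as different resources, which is the stricter behavior the .htaccess rules mentioned above aim for.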

Jorge Bucaran