When I make a request to http://www.example.com, why do I see http://www.example.com/ in the webRequest.onBeforeRequest listener?

For example:

chrome.webRequest.onBeforeRequest.addListener(
  details => console.log('Sending request to', details.url),
  { urls: ['<all_urls>'] });
fetch('http://www.example.com');

will print

Sending request to http://www.example.com/

That is consistent with the request URL shown in the network request monitor. For example, if I take it and convert it to a curl command, the request looks like this:

curl 'http://www.example.com/' -H 'Accept: */*' -H 'Connection: keep-alive' \
    -H 'Accept-Encoding: gzip, deflate' -H 'Accept-Language: en-US,en;q=0.9' \
    -H 'User-Agent: ...' --compressed

So the request that actually goes out is for http://www.example.com/, not for http://www.example.com. That decision must have been made in the browser, not by the server.

The same behavior also occurs when using XMLHttpRequest instead of fetch. In my example, I used Chrome, but on Firefox it is the same.

Questions:

  • Why does the browser change it automatically? It also happens with other URLs. From my understanding, adding a trailing slash will often work, but in general, it is a breaking change.
  • If I want to filter in the onBeforeRequest listener for the current request to a specific URL, how can I reliably match it? For instance, just checking whether the URLs are identical will fail.
  • Are there more URL rewrite rules in the browser to be aware of?
Philipp Claßen
  • Some other answers regarding the trailing slash: https://stackoverflow.com/questions/2581411/do-web-browsers-always-send-a-trailing-slash-after-a-domain-name, https://webmasters.stackexchange.com/questions/35643/is-trailing-slash-automagically-added-on-click-of-home-page-url-in-browser – Luka Čelebić Nov 17 '17 at 17:46
  • @PredatorIWD Thanks for the link. That seems to confirm what I wrote in my answer. – Philipp Claßen Nov 17 '17 at 18:42

1 Answer

I think I found it. The browser is just fixing an invalid URL.

To cite from Wikipedia, a URL looks like this:

scheme:[//[user[:password]@]host[:port]][/path][?query][#fragment]

The path must begin with a single slash (/) if an authority part was present, and may also if one was not, but must not begin with a double slash. The path is always defined, though the defined path may be empty (zero length), therefore no trailing slash.

http://example.com has an authority part (in this example, the hostname example.com that follows the scheme http://), but that leaves the path empty. According to the specification, the path must start with a /, so the browser fixes it by replacing the empty path with /.

If you use a valid URL instead, like http://example.com/abc, the browser does not need to modify it.
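
For matching inside the listener, one option is to run both URLs through the standard URL constructor before comparing them, so that the empty path becomes / on both sides. The following is only a minimal sketch: normalizeUrl and isSameUrl are hypothetical helper names of mine, target is an example value, and the chrome.webRequest part is guarded so the snippet also runs outside an extension.

```javascript
// Hypothetical helper: serialize a URL string through the URL parser.
// Returns null when the input cannot be parsed as an absolute URL.
function normalizeUrl(input) {
  try {
    return new URL(input).href;
  } catch (e) {
    return null;
  }
}

// Hypothetical helper: true when both strings parse to the same URL.
function isSameUrl(a, b) {
  const left = normalizeUrl(a);
  return left !== null && left === normalizeUrl(b);
}

// Example value; note the missing trailing slash.
const target = 'http://www.example.com';

// Only register the listener when the extension API is available.
if (typeof chrome !== 'undefined' && chrome.webRequest) {
  chrome.webRequest.onBeforeRequest.addListener(
    details => {
      if (isSameUrl(details.url, target)) {
        console.log('Matched request to', details.url);
      }
    },
    { urls: ['<all_urls>'] });
}
```

With this, isSameUrl('http://www.example.com', 'http://www.example.com/') is true, while http://www.example.com/abc and http://www.example.com/abc/ still count as different URLs. As far as I can tell, the same parser also lowercases the scheme and host and drops a default port such as :80, so comparing the parsed form should cover those cases as well.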

Philipp Claßen
  • AFAIR this comes from HTTP RFCs. And here is the relevant quote https://stackoverflow.com/a/2581514/529442 – Eugene Gr. Philippov Jan 14 '21 at 17:09
  • @Philip, I don't think this is fixing the URL. Kindly look at this question I posted, where the "/" is not added automatically: https://stackoverflow.com/questions/76233565/need-to-route-to-specific-page-if-i-enter-url-with-path – Sup Ravi Kumar May 12 '23 at 06:48
  • @SupRaviKumar There is a difference between a technically invalid URL like `http://example.com` and a valid URL like `http://example.com/abc`. Only in the first case will the "/" be added (as described in my answer). Your linked question covers the second case. The browser will not change it, since the URL is well-formed and changing it could change its meaning. – Philipp Claßen May 12 '23 at 12:24