
When I type a site's URL into the browser's address bar, the browser sends a request to get the resource at that URL. But when I go to different web sites (google.com, amazon.com, etc.), the requests that initially load the page have different headers for different sites.

Where does the browser get the set of request headers used to load the page, if all it has on the first load is the URL of the resource?

For example, when I go to google.com the browser sends these request headers:

:authority: www.google.com
:method: GET
:path: /
:scheme: https
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9,ru-RU;q=0.8,ru;q=0.7
cache-control: max-age=0
sec-fetch-dest: document
sec-fetch-mode: navigate
sec-fetch-site: same-origin
sec-fetch-user: ?1
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36

For amazon.com, the request headers are different:

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9,ru-RU;q=0.8,ru;q=0.7
Connection: keep-alive
Host: amazon.com
Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: none
Sec-Fetch-User: ?1
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36
Denys Koval

1 Answer


When you type a URL into the address bar, it needs to be translated into an HTTP request.

So typing www.google.com means you need to GET the default page (/) from that server. That's basically all covered in the first 4 lines in the first request.

The browser also knows which formats it can accept, and lists them in the accept header. Mostly we deliver HTML back so text/html is certainly in there, but we also accept other formats - including the completely generic */* btw! :-)
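To make the accept header a little more concrete, here is a minimal sketch of how a server might read it. The function name `parse_accept` is made up for this illustration, and the parsing is deliberately simplified (it only looks at the `q` parameter and ignores everything else, such as `v=b3`).

```python
def parse_accept(header: str) -> list[tuple[str, float]]:
    """Return (media_type, q) pairs, most preferred first."""
    entries = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        q = 1.0  # q defaults to 1 when the client does not give one
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        entries.append((media_type, q))
    # sorted() is stable, so entries with the same q keep their header order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


# The accept value from the google.com request in the question
accept = ("text/html,application/xhtml+xml,application/xml;q=0.9,"
          "image/webp,image/apng,*/*;q=0.8,"
          "application/signed-exchange;v=b3;q=0.9")

for media_type, q in parse_accept(accept):
    print(f"{q:.1f}  {media_type}")
```

So text/html ends up at the top with the default weight of 1, and the catch-all */* ends up last with q=0.8.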

Responses are often compressed (with gzip, deflate or the newer brotli (br) format), so the browser tells the server which of those it supports in the accept-encoding header.
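As a rough sketch of what a server could do with accept-encoding: pick a coding both sides support and compress the response body with it. The helper names `choose_encoding` and `encode_body` are hypothetical, q-values are ignored for brevity, and brotli is left out only because it is not in Python's standard library.

```python
import gzip
import zlib


def choose_encoding(accept_encoding: str, server_supports=("gzip", "deflate")) -> str:
    """Pick the first coding listed by the client that this server also supports."""
    offered = [token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")]
    for coding in offered:
        if coding in server_supports:
            return coding
    return "identity"  # nothing in common: send the body uncompressed


def encode_body(body: bytes, coding: str) -> bytes:
    """Compress the response body with the chosen content coding."""
    if coding == "gzip":
        return gzip.compress(body)
    if coding == "deflate":
        return zlib.compress(body)  # HTTP "deflate" is the zlib format
    return body


body = b"<html>" + b"hello " * 200 + b"</html>"
coding = choose_encoding("gzip, deflate, br")  # value from the question
compressed = encode_body(body, coding)
print(coding, len(body), "->", len(compressed))
```

The server would then say which coding it used in the content-encoding response header, so the browser knows how to decompress the body.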

When you installed your browser you also set a default language, so the browser can tell the server that in the accept-language header. Some servers will return different content based on this.

Then there are some security headers (I won't go into these as they are quite complicated).

Finally we have the user-agent header. This is basically where the browser tells the server whether it's Chrome, Firefox or whatever. But for historical reasons it's much longer than just "Chrome".

So basically the request headers are things the browser sends to the server to give it more information about the browser and its capabilities. For a request that's just typed into the browser, the request headers will basically be the same no matter what the URL is. For additional requests made by the page (e.g. by JavaScript code) they may be different if the page adds more headers.
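To illustrate the point that only the URL-derived part changes, here is a toy model of a browser building a navigation request. It is not how any real browser is implemented: `BROWSER_DEFAULTS`, `navigation_request` and the `ExampleBrowser/1.0` user agent are all invented for this sketch, and the header values are shortened.

```python
from urllib.parse import urlsplit

# Hypothetical stand-ins for values the browser already knows from its own
# build and settings, before it has ever contacted the site
BROWSER_DEFAULTS = {
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "accept-encoding": "gzip, deflate, br",
    "accept-language": "en-US,en;q=0.9",
    "upgrade-insecure-requests": "1",
    "user-agent": "ExampleBrowser/1.0",
}


def navigation_request(url: str) -> dict:
    """Build the request for a URL typed into the address bar."""
    parts = urlsplit(url)
    request = {
        "method": "GET",            # typing a URL always means a GET
        "scheme": parts.scheme,
        "host": parts.netloc,
        "path": parts.path or "/",  # no path means the default page
    }
    # The headers come from the browser itself, not from the URL
    return {"request": request, "headers": dict(BROWSER_DEFAULTS)}


google = navigation_request("https://www.google.com")
amazon = navigation_request("https://amazon.com/")
print(google["request"])
print(amazon["request"])
print(google["headers"] == amazon["headers"])
```

The two requests differ only in the host, while the headers are identical, because nothing in them depends on the site.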

As to the differences between the two example requests you gave:

Google uses HTTP/2 (or QUIC if using Chrome, but for now that's basically HTTP/2 as far as this question is concerned). You can see this if you add the optional Protocol column to developer tools.

HTTP/2 has a couple of changes from HTTP/1, namely:

  • HTTP header names are lower cased. Technically in HTTP/1 they are case insensitive, but by convention many tools like browsers used title case (capitalising the first letter of each word).
  • The request line (e.g. GET / HTTP/1.1) is converted to pseudo headers beginning with a colon (:method: GET, :path: /, etc.).
  • Host is basically :authority in HTTP/2.
  • :scheme is basically new in HTTP/2, as previously it wasn't explicitly part of the HTTP request and was handled at the connection level.
  • Connection is defunct in HTTP/2. Even in HTTP/1.1 it defaulted to keep-alive, so the above header was not necessary, but lots of browsers and other clients sent it for historical reasons.
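The differences in the list above can be sketched as a small conversion function. This only shows the naming changes: real HTTP/2 sends headers in a binary, compressed form rather than as text, and `to_http2_headers` is a name invented for this example.

```python
def to_http2_headers(request_line: str, headers: dict, scheme: str = "https") -> list:
    """Map an HTTP/1.1 request line and headers to HTTP/2-style header pairs."""
    method, path, _version = request_line.split(" ")
    # The request line becomes pseudo headers starting with a colon
    pseudo = [(":method", method), (":path", path), (":scheme", scheme)]
    regular = []
    for name, value in headers.items():
        lower = name.lower()  # header names are lower case in HTTP/2
        if lower == "host":
            pseudo.insert(0, (":authority", value))  # Host becomes :authority
        elif lower == "connection":
            continue  # Connection is not used in HTTP/2
        else:
            regular.append((lower, value))
    return pseudo + regular  # pseudo headers go before regular ones


# A few of the amazon.com headers from the question
http1_headers = {
    "Host": "amazon.com",
    "Connection": "keep-alive",
    "Accept-Encoding": "gzip, deflate, br",
    "Sec-Fetch-Site": "none",
}

for name, value in to_http2_headers("GET / HTTP/1.1", http1_headers):
    print(f"{name}: {value}")
```

Run on the amazon.com headers, this gives the same shape as the google.com request: four pseudo headers first, then lower-case header names, and no Connection header.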

I think that explains all the differences.

So how does the browser know whether to use HTTP/2 or HTTP/1.1? That already has an answer on Stack Overflow, but basically it's decided when the HTTPS session is established: HTTP/2 is used if the server advises that it can support it and the browser wants to use it.
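That negotiation happens through the ALPN extension of the TLS handshake, and you can try it yourself with Python's standard `ssl` module. This is only a sketch: the function names are made up, and what you get back depends on the server and your network at the time you run it.

```python
import socket
import ssl


def make_alpn_context() -> ssl.SSLContext:
    """TLS context that offers HTTP/2 first, then HTTP/1.1, like a browser would."""
    context = ssl.create_default_context()
    context.set_alpn_protocols(["h2", "http/1.1"])
    return context


def negotiated_protocol(host: str, port: int = 443, timeout: float = 5.0):
    """Connect to host and return the protocol the server picked (or None)."""
    context = make_alpn_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            # Decided during the TLS handshake, before any HTTP request is sent
            return tls.selected_alpn_protocol()
```

Calling `negotiated_protocol("www.google.com")` needs network access. A server that supports HTTP/2 would be expected to answer with `h2`, and one that only speaks HTTP/1.1 with `http/1.1` or `None`.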

Barry Pollard
  • I mean, if I type only the URL, how does the browser know the rest of the information (request headers etc.)? Which mechanism lets the browser know everything other than the URL typed by the user? DNS servers (but those are just for getting the IP), connection setup, routers, or what? Maybe some lower-level protocol sends this information? Which OSI layer is responsible for it? – Denys Koval Jun 15 '20 at 14:30
  • If I understood correctly, TLS is in charge of choosing the HTTP protocol version, and TLS is also in charge of all the other protocol data, including the request headers, isn't it? – Denys Koval Jun 15 '20 at 14:57
  • Added some more comments. Hopefully that answers your questions. – Barry Pollard Jun 15 '20 at 14:59
  • Okay, I see. And what about the `cache-control: max-age=0` request header field? Is it configured by HTTP/2, or by the server through TLS (or some other protocol)? Basically it was the reason I asked this question) – Denys Koval Jun 15 '20 at 15:02
  • No, that will be because you refreshed the page for the first one (so the browser said "Hey this guy wants me to check if this is the latest page, so don't send me any cached, old pages please") and you didn't for the second. If you refresh amazon.com you will see the same `cache-control: max-age=0` header. – Barry Pollard Jun 15 '20 at 15:08
  • Oh... I see. You are right) If I use F5 to refresh the page, the browser adds the cache-control header... Omg)) Thanks a lot. I thought that devs configure the server to tell the browser a preset of headers. You really helped me) – Denys Koval Jun 15 '20 at 15:11
  • No problem. Do try to give the question you are actually asking if possible. If that cache-control header was the one confusing you, then it could have saved us both a lot of time if you'd asked that up front. Hopefully you've learned quite a bit anyway from the full answer, but you might not always get that - you're much more likely to get a quick answer if you've a quick, well-defined question! – Barry Pollard Jun 15 '20 at 15:17