3

I have recently been told by a coworker that the query string of an HTTPS GET request is visible to third parties, and I set out to prove him wrong. But finding any explicit description of URL parsing has been difficult.

My understanding has been that the URL is only sent piecemeal, with the domain passed into the IP header, the port passed into the TCP header, etc. In the particular case of an HTTPS GET, this would mean that the query string will only reside in the HTTP header, which in turn resides in the TLS body, which is end-to-end encrypted and therefore safe.

My question, then, is twofold:

  • First, am I right about the particular case of an HTTPS GET query string?
  • Second, can anyone provide me with a general anatomy of a URL with an eye toward how its parts translate into a TCP/IP request?
Colin P. Hill

2 Answers

6

Firstly, you are right that the query string is protected in transit when using HTTPS. There are already a number of questions about this, for example this one. Essentially, HTTPS is HTTP over SSL/TLS, so the SSL/TLS connection is set up before any HTTP traffic is sent. (What may still be visible is the host name, either in the Server Name Indication extension of the TLS handshake or through DNS requests.)

Secondly, when you make a GET request to https://host.example:port/something?blabla=1, this is an overview of what happens:

  • Your browser establishes a TCP connection to host.example on that port.

  • Since it's an https:// URL, an SSL/TLS connection is established. The SSL/TLS stack should verify the certificate and that it matches the host name you're after.

  • On top of this SSL/TLS connection (that would be directly on top of the TCP connection when using plain HTTP), your browser sends something like this:

      GET /something?blabla=1 HTTP/1.1
      Host: host.example:port
      .... (other headers)
    

    All this is sent over SSL/TLS. Note that, strictly speaking, the query parameters are an integral part of the URL and are sent in the Request-Line, not in the headers.
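The steps above can be sketched in Python to show where each part of the URL ends up. This is only an illustration, not how any particular browser is implemented: the function name `describe_request` and the dictionary keys are made up for this example, and port 8443 stands in for the `port` placeholder in the URL above.

```python
from urllib.parse import urlsplit


def describe_request(url):
    """Map the parts of a URL onto the layers that carry them (hypothetical helper)."""
    parts = urlsplit(url)
    default_port = {"http": 80, "https": 443}[parts.scheme]
    port = parts.port or default_port

    # The request target is the path plus the query string.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    # The Host header carries the port only when it is not the default one.
    host_header = parts.hostname
    if port != default_port:
        host_header = f"{parts.hostname}:{port}"

    request = f"GET {target} HTTP/1.1\r\nHost: {host_header}\r\n\r\n"
    return {
        # Used for the DNS lookup and (with TLS) Server Name Indication.
        "dns_lookup_and_sni": parts.hostname,
        # Used as the destination port of the TCP connection.
        "tcp_port": port,
        # With https, the request bytes below travel inside the TLS connection.
        "encrypted": parts.scheme == "https",
        "request_bytes": request.encode("ascii"),
    }


info = describe_request("https://host.example:8443/something?blabla=1")
print(info["request_bytes"].decode("ascii"))
```

Only the host name and port are needed before the TLS handshake; the path and query string appear for the first time in the request bytes, which are sent after the encrypted connection is in place.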

You can find details about HTTP in RFC 2616 (superseded by RFC 7230 through RFC 7235) and about HTTPS in RFC 2818.

Bruno
0

IP headers do not carry a URL, a domain name, or anything like that; they carry only IP addresses. The URL travels in the HTTP message, which is carried inside the TCP payload. In this case, everything in that payload is encrypted.

It's worth pointing out that, although an observer sees very little of a given HTTPS message, some information still leaks. For a request such as

GET https://example.com/query?q=text

I believe you would see a DNS request for 'example.com' in the clear; everything else would be encrypted inside the HTTPS connection. All this to say: don't forget about encryption during the host resolution step. The host name is not in the IP header (that's what IP addresses are for), but we have to get those addresses via DNS.
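To make the DNS point concrete, here is a minimal sketch that builds the bytes of a classic, unencrypted DNS query by hand and shows that the host name sits in them as plain text. The function name `build_dns_query` and the query ID are arbitrary choices for illustration, and the sketch assumes ordinary DNS over UDP; encrypted transports such as DNS over TLS or DNS over HTTPS would not expose the name this way.

```python
import struct


def build_dns_query(hostname, query_id=0x1234):
    """Build a raw DNS query for an A record (hypothetical helper, no network I/O)."""
    # Header: ID, flags (recursion desired), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)

    # QNAME: each label is prefixed with its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00"

    # QTYPE 1 = A record, QCLASS 1 = IN (Internet).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


packet = build_dns_query("example.com")
print(packet)
```

Anyone able to capture this packet on the network path can read the labels `example` and `com` directly, while the path and query string of the URL never appear in it.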

Brad Larson