
I am writing an HTTP proxy and I am having trouble understanding some details of making a CONNECT request over TLS. To get a better picture, I am experimenting with Apache to observe how it interacts with clients. This is from my default virtual host.

NameVirtualHost *:443
<VirtualHost *:443>
  ServerName example.com
  DocumentRoot htdocs/example.com  
  ProxyRequests On
  AllowConnect 22
  SSLEngine on
  SSLCertificateFile /root/ssl/example.com-startssl.pem
  SSLCertificateKeyFile /root/ssl/example.com-startssl.key
  SSLCertificateChainFile /root/ssl/sub.class1.server.ca.pem
  SSLStrictSNIVHostCheck off
</VirtualHost>

The conversation between Apache and my client goes like this.

a. client connects to example.com:443 and sends example.com in the TLS handshake.

b. client sends HTTP request.

CONNECT 192.168.1.1:22 HTTP/1.1
Host: example.com
Proxy-Connection: Keep-Alive

c. Apache says HTTP/1.1 400 Bad Request. The Apache error log says

Hostname example.com provided via SNI and hostname 192.168.1.1
provided via HTTP are different. 

It appears that Apache does not look at the Host header other than to check that it is present, since HTTP/1.1 requires it. I get the same failure if the client sends Host: foo. If I make the HTTP request to example.com:80 without TLS, then Apache will connect me to 192.168.1.1:22.

I don't completely understand this behavior. Is there something wrong with the CONNECT request? I can't seem to locate the relevant parts of the RFCs that explain all this.

sigjuice
  • SNI above means the host name sent in the handshake, not the host header. As written in my answer below mixing SSL and CONNECT proxies is not typical. It looks like Apache is not expecting this at all as it does certificate validation. You can try `SSLStrictSNIVHostCheck off` in Apache. – eckes Feb 16 '13 at 01:04

4 Answers


It's not clear whether you're trying to use Apache Httpd as a proxy server; if it isn't acting as one, that would explain the 400 status code you're getting. CONNECT is used by the client and sent to the proxy server (possibly Apache Httpd, but usually not), not to the destination web server.

CONNECT is used between the client and the proxy server before establishing the TLS connection between the client and the end server. The client (C) connects to the proxy (P) proxy.example.com and sends this request (including blank line):

C->P: CONNECT www.example.com:443 HTTP/1.1
C->P: Host: www.example.com:443
C->P:

The proxy opens a TCP connection to www.example.com:443 (P-S) and responds to the client with a 200 status code, accepting the request:

P->C: HTTP/1.1 200 OK
P->C: 

After this, the connection between the client and the proxy (C-P) is kept open. The proxy server relays everything on the C-P connection to and from P-S. The client upgrades its active (C-P) connection to an SSL/TLS connection by initiating a TLS handshake on that channel. Since everything is now relayed to the server, it's as if the TLS exchange was done directly with www.example.com:443.

The proxy doesn't play any role in the handshake (and thus with SNI). The TLS handshake effectively happens directly between the client and the end server.
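The client side of that exchange can be sketched in Python roughly as follows. This is only an illustration under assumptions: the function names (`build_connect_request`, `open_tunnel`, `tls_through_proxy`) are made up for this example, and error handling is minimal.

```python
import socket
import ssl


def build_connect_request(host: str, port: int) -> bytes:
    """Build a CONNECT request in authority form, repeating it in Host."""
    authority = f"{host}:{port}"
    return (
        f"CONNECT {authority} HTTP/1.1\r\n"
        f"Host: {authority}\r\n"
        "\r\n"
    ).encode("ascii")


def open_tunnel(sock: socket.socket, host: str, port: int) -> int:
    """Send CONNECT on an already-connected socket and return the status code."""
    sock.sendall(build_connect_request(host, port))
    # Read one byte at a time so nothing after the blank line is consumed:
    # those bytes already belong to the tunnel.
    response = b""
    while not response.endswith(b"\r\n\r\n"):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("proxy closed the connection during CONNECT")
        response += chunk
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    # A status line looks like "HTTP/1.1 200 Connection established".
    return int(status_line.split(" ", 2)[1])


def tls_through_proxy(proxy_addr, host: str, port: int = 443) -> ssl.SSLSocket:
    """Plain TCP to the proxy, CONNECT, then TLS with the end server."""
    sock = socket.create_connection(proxy_addr)
    status = open_tunnel(sock, host, port)
    if status != 200:
        sock.close()
        raise ConnectionError(f"proxy refused CONNECT with status {status}")
    context = ssl.create_default_context()
    # SNI names the end server; the proxy only relays these bytes.
    return context.wrap_socket(sock, server_hostname=host)
```

Note that `server_hostname` is the end server's name, not the proxy's, which is why the proxy never needs to care about SNI in this setup.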

If you're writing a proxy server, all you need to do to let your clients connect to HTTPS servers is read in the CONNECT request, make a connection from the proxy to the end server (given in the CONNECT request), send the client a 200 OK reply, and then forward everything that you read from the client to the server, and vice versa.
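As a rough illustration of those steps, here is a minimal thread-per-connection sketch in Python. The helper names (`read_head`, `relay`, `handle_client`, `serve`) are made up for this example, and it deliberately leaves out access control, timeouts, and proxy authentication, so treat it as a starting point rather than a finished proxy.

```python
import socket
import threading


def read_head(sock: socket.socket) -> bytes:
    """Read the message head up to and including the blank line."""
    head = b""
    while not head.endswith(b"\r\n\r\n"):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("peer closed before the head was complete")
        head += chunk
    return head


def relay(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src reaches end of stream."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            # Pass the end of stream on to the other side.
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle_client(client: socket.socket) -> None:
    """Serve one CONNECT request, then act as a plain byte pipe."""
    try:
        request_line = read_head(client).split(b"\r\n", 1)[0].decode("ascii")
        try:
            method, target, _version = request_line.split(" ")
        except ValueError:
            client.sendall(b"HTTP/1.1 400 Bad Request\r\n\r\n")
            return
        if method != "CONNECT":
            client.sendall(b"HTTP/1.1 405 Method Not Allowed\r\n\r\n")
            return
        host, _, port = target.rpartition(":")
        try:
            # strip("[]") lets a bracketed IPv6 literal through.
            upstream = socket.create_connection((host.strip("[]"), int(port)))
        except ValueError:
            client.sendall(b"HTTP/1.1 400 Bad Request\r\n\r\n")
            return
        except OSError:
            client.sendall(b"HTTP/1.1 502 Bad Gateway\r\n\r\n")
            return
        with upstream:
            client.sendall(b"HTTP/1.1 200 Connection established\r\n\r\n")
            # From here on nothing is parsed: bytes are copied both ways.
            back = threading.Thread(
                target=relay, args=(upstream, client), daemon=True
            )
            back.start()
            relay(client, upstream)
            back.join()
    except (ConnectionError, UnicodeDecodeError):
        pass
    finally:
        client.close()


def serve(listener: socket.socket) -> None:
    """Accept clients forever, one thread per connection."""
    while True:
        client, _addr = listener.accept()
        threading.Thread(
            target=handle_client, args=(client,), daemon=True
        ).start()
```

To try it, bind and listen on a socket and pass it to `serve`. A real proxy would also restrict which targets may be reached, as `AllowConnect` does in Apache.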

RFC 2616 treats CONNECT as a way to establish a simple tunnel (which it is). There is more about it in RFC 2817, although the rest of RFC 2817 (upgrades to TLS within a non-proxy HTTP connection) is rarely used.

It looks like what you're trying to do is to have the connection between the client (C) and the proxy (P) over TLS. That's fine, but the client won't use CONNECT to connect to external web servers (unless it's a connection to an HTTPS server too).

Bruno
  • 1) I wanted to understand why a client would ever use HTTP "CONNECT" when it can use SSL directly to talk to the end server. Whether it uses "CONNECT" or SSL, the traffic would traverse the configured proxies anyway. 2) Also, in which header field does the client specify the intermediate proxy server address in the "CONNECT" request? – Sandeep May 19 '16 at 10:14
  • @Sandeep, there is no header for the proxy; instead, the client connects to the proxy with a socket directly. That's the point of proxying. – Cholthi Paul Ttiopic Apr 07 '18 at 04:02
  • So, with the CONNECT method, HTTPS data from the client is not passed to the application level of the intermediary proxy? It is just handled at the TCP level of the proxy and relayed to the remote server directly? – zzinny Oct 21 '19 at 07:05

You're doing everything right. It's Apache that got things wrong. Support for CONNECT over TLS was only added recently (https://issues.apache.org/bugzilla/show_bug.cgi?id=29744) and there are still some things to be ironed out. The issue you're hitting is one of them.

Nikratio

From RFC 2616 (section 14.23):

The Host request-header field specifies the Internet host and port number of the resource being requested, as obtained from the original URI given by the user or referring resource (generally an HTTP URL, as described in section 3.2.2). The Host field value MUST represent the naming authority of the origin server or gateway given by the original URL.

My understanding is that you need to copy the address from the CONNECT line to the Host line. After all, the address of the resource is 192.168.1.1, and the fact that you are connecting via example.com doesn't change anything from the RFC's point of view.

sigjuice
Eugene Mayevski 'Callback
  • According to section 5.2, "2. If the Request-URI is not an absoluteURI, and the request includes a Host header field, the host is determined by the Host header field value." For CONNECT, Request-URI is not an absoluteURI (section 5.1.2). – sigjuice Jul 06 '11 at 13:27
  • @sigjuice ... So 5.2 just doesn't apply (and why have you referred to it?) – Eugene Mayevski 'Callback Jul 06 '11 at 13:32
  • From 5.1.2, "Request-URI = "*" | absoluteURI | abs_path | authority". CONNECT uses the authority form of the Request-URI. Then, from 5.2, "The exact resource identified by an Internet request is determined by examining both the Request-URI and the Host header field." IMHO, Apache should use the Host header to determine the host and not fail with the error "host provided by SNI and host provided by HTTP are different" (example.com vs 192.168.1.1). – sigjuice Jul 06 '11 at 17:46
  • @sigjuice You are pulling the wrong variable (sections 5.1 and 5.2) into the equation. As for Apache - most likely they use Host header in certificate management, not taking much care about RFCs. – Eugene Mayevski 'Callback Jul 06 '11 at 18:55
  • If I send the CONNECT via a non-TLS HTTP/1.1 connection to port 80, the Host header still seems irrelevant. I can say "Host: abc" and Apache will still connect to port 22. To me, this looks like a violation of 5.2. – sigjuice Jul 06 '11 at 19:07
  • @sigjuice: the way I read section 14.23, the `Host` header must be used to indicate the host of the requested resource. Using `CONNECT` doesn't fall in the category where the `Host` header would allow you to choose which virtual host should handle `CONNECT`: the requested resource would still be the end target of the client. That's also consistent with the non-`CONNECT` usage of `Host` for proxy servers as specified in section 14.23. I just don't think name-based selection of the proxy host itself was envisaged. – Bruno Jul 06 '11 at 21:09
  • @Bruno What you say about CONNECT makes sense. Thanks for the explanation. I was hoping Apache would let me CONNECT to 192.168.1.1:22 while treating the request as being handled within the context of the virtual host example.com. I still believe this would be consistent with 14.23 and 5.2. And this way TLS and HTTP/1.1 would agree on the host name. – sigjuice Jul 06 '11 at 22:40

It is quite rare to see the CONNECT method inside TLS (https). I actually don't know of any client that does this (and I would be interested to know which ones do, because I think it is actually a good feature).

Normally the client connects to the proxy over plain TCP (HTTP) and sends a CONNECT request (with a Host header) for host:443. The proxy then opens a transparent connection to the endpoint, and the client sends its SSL handshake through it.

In this scenario the data is ssl protected "end to end".

The CONNECT method is not really specified in RFC 2616; the name is only reserved there. In practice it is simple enough to be interoperable. The request line specifies host[:port]. The Host: header can simply be ignored. Some additional proxy authentication headers might be needed. Once the request headers end and the tunneled data begins, the proxy no longer has to parse anything (some proxies do, because they check for a valid SSL handshake).
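To make the host[:port] part concrete, a proxy could pull the target out of the request line along these lines. This is only a sketch: `parse_connect_target` is a made-up name, and falling back to port 443 when the port is missing is my assumption, not something any RFC promises. The Host header is not consulted at all.

```python
def parse_connect_target(request_line: str, default_port: int = 443):
    """Return (host, port) from a line such as 'CONNECT host:443 HTTP/1.1'."""
    parts = request_line.strip().split(" ")
    if len(parts) != 3 or parts[0] != "CONNECT":
        raise ValueError(f"not a CONNECT request line: {request_line!r}")
    target = parts[1]
    if target.startswith("["):
        # Bracketed IPv6 literal, for example [::1]:8443.
        host, _, rest = target[1:].partition("]")
        port = rest[1:] if rest.startswith(":") else ""
    else:
        host, _, port = target.partition(":")
    if not host:
        raise ValueError(f"missing host in target: {target!r}")
    # Assumption: a missing port falls back to default_port.
    return host, int(port) if port else default_port
```

For the request in the question, `parse_connect_target("CONNECT 192.168.1.1:22 HTTP/1.1")` gives `("192.168.1.1", 22)`, regardless of what the Host header says.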

eckes
  • BTW: Chrome supports SSL Connections to Proxies: http://www.chromium.org/developers/design-documents/secure-web-proxy – eckes Oct 22 '13 at 02:26