4

I am having a problem with a misbehaving HTTP Proxy server. I have no control over the proxy server, unfortunately -- it's an 'enterprise' product from IBM. The proxy server is part of a service virtualization solution being leveraged for software testing.

The fundamental issue (I think*) is that the proxy server sends back HTTP/1.0 responses. I can get it to work fine from SOAP UI (a Java application) and from curl on the command line, but Python refuses to connect. From what I can tell, Python is the one behaving correctly and the other two are not, yet the server expects HTTP/1.1 requests (it wants Host headers, at the very least, to route the service request to a given stub).

Is there a way to get Requests, or the underlying urllib3, or the even lower-level http.client, to always use HTTP/1.1, even if the other end appears to be using 1.0?

Here is a sample program to reproduce the problem (unfortunately, it requires an IBM Rational Integration Tester installation with RTCP to really replicate it):

# Turn on wire-level debugging in http.client and verbose logging for
# requests' bundled urllib3
import http.client as http_client
http_client.HTTPConnection.debuglevel = 1
import logging
import requests

logging.basicConfig()
logging.getLogger().setLevel(logging.DEBUG)
requests_log = logging.getLogger("requests.packages.urllib3")
requests_log.setLevel(logging.DEBUG)
requests_log.propagate = True

requests.post("https://host:8443/axl", 
            headers={"soapAction": '"CUCM:DB ver=9.1 updateSipTrunk"'}, 
            data='<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:tns="http://www.cisco.com/AXL/API/9.1"><soapenv:Header/><soapenv:Body><tns:updateSipTrunk><name>PLACEHOLDER</name><newName>PLACEHOLDER</newName><destinations><destination><addressIpv4>10.10.1.5</addressIpv4><sortOrder>1</sortOrder></destination></destinations></tns:updateSipTrunk></soapenv:Body></soapenv:Envelope>', 
            verify=False)

(Proxy is configured via HTTPS_PROXY environment variable)
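
For reference, passing the proxy to requests explicitly is equivalent to setting the environment variable; this is just an illustrative sketch using the proxy host/port from the curl output below:

import requests

# requests picks up HTTPS_PROXY from the environment by default (trust_env=True);
# an explicit proxies mapping does the same thing.
proxies = {"https": "http://proxy-host.com:3199"}
requests.post("https://host:8443/axl", proxies=proxies, verify=False)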

Debug output before the error; note the HTTP/1.0:

INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): host.com
send: b'CONNECT host.com:8443 HTTP/1.0\r\n'
send: b'\r\n'
header: Host: host.com:8443

header: Proxy-agent: Green Hat HTTPS Proxy/1.0

The exact error text that occurs on RHEL 6 is:

requests.exceptions.SSLError: [SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:646)

Even though the Host header is shown here, it does NOT show up on the wire. I confirmed this with a tcpdump:

14:03:14.315049 IP sourcehost.53214 > desthost.com: Flags [P.], seq 0:32, ack 1, win 115, options [nop,nop,TS val 2743933964 ecr 4116114841], length 32
        0x0000:  0000 0c07 ac00 0050 56b5 4044 0800 4500  .......PV.@D..E.
        0x0010:  0054 3404 4000 4006 2ca0 0af8 3f15 0afb  .T4.@.@.,...?...
        0x0020:  84f8 cfde 0c7f a4f8 280a 4ebd b425 8018  ........(.N..%..
        0x0030:  0073 da46 0000 0101 080a a38d 1c0c f556  .s.F...........V
        0x0040:  XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX  ..CONNECT.host
        0x0050:  XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX  xx:8443.HTTP/1.0
        0x0060:  0d0a                          

When I curl it with verbose, this is what the output looks like:

* About to connect() to proxy proxy-host.com port 3199 (#0)
*   Trying 10.**.**.** ... connected
* Connected to proxy-host.com (10.**.**.**) port 3199 (#0)
* Establish HTTP proxy tunnel to host.com:8443
> CONNECT host.com:8443 HTTP/1.1
> Host: host.com:8443
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Proxy-Connection: Keep-Alive
> soapAction: "CUCM:DB ver=9.1 updateSipTrunk"
>
< HTTP/1.0 200 OK
< Host: host.com:8443
< Proxy-agent: Green Hat HTTPS Proxy/1.0
<
* Proxy replied OK to CONNECT request
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: /path/to/store/ca-bundle.crt
  CApath: none
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256

Truncated after this point. You can see the HTTP/1.0 response from the proxy after connecting. The tcpdump of the curl run also clearly shows the Host header, as well as HTTP/1.1.

*I can't be entirely sure this is the fundamental issue, as I can't test it directly. I do see HTTP/1.0 responses, and can tell that my non-working Python code sends CONNECT HTTP/1.0 messages, while the working Java client sends HTTP/1.1 messages, as does curl. It's possible the problem is unrelated (although I find that unlikely), or that Python is the one misbehaving rather than Java/curl. I simply don't know enough to know for sure.

So, is there a way to force urllib3/requests to use HTTP v1.1 at all times?

Keozon
  • It looks like it is receiving the host header. It looks like the proxy CONNECT is fine, but it's failing the SSL negotiation after that for some reason. – Max Apr 13 '17 at 00:28
  • What makes you say it's receiving the host header? It's not part of the CONNECT message on the wire. It is part of the CONNECT message for curl (I didn't post the curl tcpdump here). – Keozon Apr 13 '17 at 00:37
  • I think the HTTP/1.0 response to CONNECT is fine, and is not causing the problem. It's fundamentally indicating it made the tunnel, and anything after that is being passed opaquely through the proxy anyway. As others said, the reported error indicates a TLS handshake problem. – Adrien Apr 13 '17 at 00:41
  • You're receiving a Host header from the proxy server, not sending it (it doesn't look like it's required, since all the information for CONNECT is in the URL). At least, that's what I'm interpreting the logging as: printing the Host and Proxy-agent headers it receives. At that point, it has to negotiate TLS, and then send a whole new set of headers. – Max Apr 13 '17 at 00:41
  • I also pursued the TLS route for a long time. Days, actually. I updated Python, checked OpenSSL versions, scoured the internet for incompatibilities between OpenSSL and JSSE... Found nothing. I have no issues connecting to anything else, just this. I'm not saying that it can't be an SSL issue... but this is the only significant difference in the traffic between the successful Java client and the failed Python one. If you have something specific in mind, though, I'd be glad to try it. – Keozon Apr 13 '17 at 00:48
  • Have you tried handling the proxy connection with `socks`? – t.m.adam Apr 13 '17 at 01:19
  • You can see the source line 646 the error is pointing to in CPython source browser: https://github.com/python/cpython/blob/2.7/Modules/_ssl.c#L646 . It's _"The handshake operation timed out"_ (assuming you have 2.7). `requests` is doing you a disservice here by failing to provide all the error details. – ivan_pozdeev Apr 13 '17 at 01:31
  • Since it's your local business app installation, you should be able to get the server's private key and decipher the SSL traffic, e.g. with Wireshark. You should also capture the traffic both before and after the proxy to see if anything is missing. – ivan_pozdeev Apr 13 '17 at 01:37

2 Answers

4

httplib (which requests relies upon for HTTP(S) heavy lifting) always uses HTTP/1.0 with CONNECT:

Lib/httplib.py:788:

def _tunnel(self):
    self.send("CONNECT %s:%d HTTP/1.0\r\n" % (self._tunnel_host,
        self._tunnel_port))
    for header, value in self._tunnel_headers.iteritems():
        self.send("%s: %s\r\n" % (header, value))
    self.send("\r\n")
    <...>

So you can't "force" it to use "HTTP/1.1" here other than by editing the subroutine.


This MAY be the problem if the proxy doesn't support HTTP/1.0 - in particular, 1.0 does not require a Host: header, and indeed, as you can see by comparing your log output with the code above, httplib does not send one, while in reality a proxy may expect it regardless. But if that were the case, you should have gotten an error from the proxy, or something, in response to CONNECT -- unless the proxy is so broken that it substitutes some default (or garbage) for Host:, returns 200 anyway and tries to connect God-knows-where, at which point you're getting timeouts.

You can make httplib add the Host: header to CONNECT by adding it to _tunnel_headers (indirectly):

import os
import requests

s = requests.Session()
proxy_url = os.environ['HTTPS_PROXY']

# Have to specify the proxy here because the env variable is only detected by
# httplib code, while we need to trigger requests' proxy logic, which acts earlier.
# "https" means any https host. Since a Session persists cookies, it's
# meaningless to make requests to multiple hosts through it anyway.
s.proxies["https"] = proxy_url

# Reach into urllib3's ProxyManager for this proxy and add the Host header
# to be sent with the CONNECT request.
pm = s.get_adapter("https://").proxy_manager_for(proxy_url)
pm.proxy_headers['Host'] = "host.com"
del pm, proxy_url
<...>
s.get('https://host.com')
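
For what it's worth, the way this appears to work is that urllib3's HTTPSConnectionPool hands the ProxyManager's proxy_headers to httplib's set_tunnel() when it prepares the proxy connection, which is how the extra Host ends up in _tunnel_headers and on the CONNECT request.
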
ivan_pozdeev
  • Thanks for the advice. I tried adding the header there, but I still don't see it going over the wire. I'm going to strip away requests and go straight to the source httplib... which will probably be more work than it's worth, but I need to get to the bottom of this. – Keozon Apr 13 '17 at 13:41
  • Scratch that, I do see the header, but in the next packet. In curl, it's in the same packet. This is going to be harder to investigate than I thought... – Keozon Apr 13 '17 at 14:01
  • @Keozon It's Python, an interpreted language, for God's sake! Just step through the code in `pdb` and pinpoint the place where it goes wrong. – ivan_pozdeev Apr 13 '17 at 15:32
  • This ended up being a combination of user error and DNS configuration on the proxy server. The Python script was calling a hostname, not an FQDN, which failed to look up on the proxy after CONNECT. It was overlooked for a long time, since I could easily connect directly, through a different proxy, and via HTTP. The curl command just happened to use the fully qualified name, or it would have failed as well. Thanks for your help! Marking as accepted, as you answered the direct question accurately and gave some advice on where to go next. – Keozon Apr 13 '17 at 21:26
1

If you do not depend on the requests library, you may find the following snippet useful:

import http.client

# Connect to the proxy and tunnel through it to the target host;
# set_tunnel() lets you pass extra headers to send with the CONNECT request.
conn = http.client.HTTPSConnection("proxy.domain.lu", 8080)
conn.set_tunnel("www.domain.org", 443, headers={'User-Agent': 'curl/7.56.0'})
conn.request("GET", "/api")
response = conn.getresponse()

print(response.read())
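
A variation on the same idea, as an untested sketch with placeholder hostnames: take the proxy from the HTTPS_PROXY environment variable and send an explicit Host header with the CONNECT request, which is what the proxy in the question seems to want. Note that set_tunnel() only controls the extra tunnel headers; on the Python versions discussed above, the CONNECT request line itself will still say HTTP/1.0.

import os
import http.client
import ssl
from urllib.parse import urlsplit

# HTTPS_PROXY is assumed to look like http://proxy-host.com:3199;
# the target host/port and request below are placeholders taken from the question.
proxy = urlsplit(os.environ["HTTPS_PROXY"])
conn = http.client.HTTPSConnection(
    proxy.hostname, proxy.port,
    context=ssl._create_unverified_context())  # mirrors verify=False in the question

# Send an explicit Host header along with the CONNECT request
conn.set_tunnel("host.com", 8443, headers={"Host": "host.com:8443"})

conn.request("POST", "/axl",
             body="<soapenv:Envelope>...</soapenv:Envelope>",
             headers={"soapAction": '"CUCM:DB ver=9.1 updateSipTrunk"'})
print(conn.getresponse().status)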