I am using Python 2.6.5 and I am trying to capture the raw HTTP request that urllib2 sends. This works fine until I add a proxy handler into the mix. The situation is as follows:
- HTTP and HTTPS requests work fine without the proxy handler: raw HTTP request captured
- HTTP requests work fine with proxy handler: proxy ok, raw HTTP request captured
- HTTPS requests fail with proxy handler: proxy ok but the raw HTTP request is not captured!
The following questions are close but do not solve my problem:
- How do you get default headers in a urllib2 Request? <- My solution is heavily based on this
- Python urllib2 > HTTP Proxy > HTTPS request
- This sets the proxy for each request <- Did not work, and setting the proxy once at the start via an opener is more elegant and efficient than setting it on every request
This is what I am doing:
import httplib
import urllib2

RawRequest = None  # Last raw request seen by send()

class MyHTTPConnection(httplib.HTTPConnection):
    def send(self, s):
        global RawRequest
        RawRequest = s  # Saving to global variable for Requester class to see
        httplib.HTTPConnection.send(self, s)

class MyHTTPHandler(urllib2.HTTPHandler):
    def http_open(self, req):
        return self.do_open(MyHTTPConnection, req)

class MyHTTPSConnection(httplib.HTTPSConnection):
    def send(self, s):
        global RawRequest
        RawRequest = s  # Saving to global variable for Requester class to see
        httplib.HTTPSConnection.send(self, s)

class MyHTTPSHandler(urllib2.HTTPSHandler):
    def https_open(self, req):
        return self.do_open(MyHTTPSConnection, req)
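To show the capture technique in isolation (no proxy, no urllib2, no network), here is a minimal sketch. `OfflineCapturingConnection` is a hypothetical helper that is not part of my real code: it overrides `connect()` to do nothing and stores whatever `send()` receives instead of writing it to a socket. The import fallback is only there so the sketch also runs on Python 3.

```python
try:
    import httplib  # Python 2, as used in the question
except ImportError:
    import http.client as httplib  # Python 3 name for the same module


class OfflineCapturingConnection(httplib.HTTPConnection):
    """Hypothetical helper: records what send() receives, never opens a socket."""

    def __init__(self, host):
        # Explicit base-class call, since httplib classes are old-style on Python 2.
        httplib.HTTPConnection.__init__(self, host)
        self.captured = []

    def connect(self):
        # Deliberately a no-op so the sketch runs without network access.
        pass

    def send(self, data):
        # Same hook as MyHTTPConnection.send, minus the real socket write.
        self.captured.append(data)


conn = OfflineCapturingConnection("example.com")
conn.request("GET", "/")          # builds the request line and headers, then calls send()
raw_request = conn.captured[0]    # request line plus headers, as handed to send()
print(raw_request)
```

This only demonstrates that `send()` receives the fully assembled request when no proxy is involved; it does not reproduce the proxy problem.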
Requester class:
global RawRequest
ProxyConf = { 'http':'http://127.0.0.1:8080', 'https':'http://127.0.0.1:8080' }
# With ProxyConf = { 'http':'http://127.0.0.1:8080' } the raw HTTPS request IS captured, BUT the proxy does not see the HTTPS request!
# Also tried with similar results: ProxyConf = { 'http':'http://127.0.0.1:8080', 'https':'https://127.0.0.1:8080' }
ProxyHandler = urllib2.ProxyHandler(ProxyConf)
urllib2.install_opener(urllib2.build_opener(ProxyHandler, MyHTTPHandler, MyHTTPSHandler))
# The request is only sent (and send() only called) once it is opened
urllib2.urlopen(urllib2.Request('http://www.google.com', None)) # global RawRequest updated
# This is the problem: global RawRequest NOT updated!?
urllib2.urlopen(urllib2.Request('https://accounts.google.com', None))
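One way to sanity-check that the custom handlers actually end up in the opener alongside the proxy handler is to list `opener.handlers`. This is a minimal sketch, not a fix: the two handler classes below are empty stand-ins for the ones above, the proxy address is the one from my configuration, and nothing is sent over the network. The import fallback is only there so it also runs on Python 3.

```python
try:
    import urllib2  # Python 2, as used in the question
except ImportError:
    import urllib.request as urllib2  # Python 3 name for the same module


# Empty stand-ins for the handlers defined earlier (assumption: only
# their position in the opener matters for this check).
class MyHTTPHandler(urllib2.HTTPHandler):
    pass


class MyHTTPSHandler(urllib2.HTTPSHandler):
    pass


proxy_conf = {'http': 'http://127.0.0.1:8080', 'https': 'http://127.0.0.1:8080'}
proxy_handler = urllib2.ProxyHandler(proxy_conf)

# build_opener() drops a default handler when a subclass of it is passed in.
opener = urllib2.build_opener(proxy_handler, MyHTTPHandler, MyHTTPSHandler)

handler_names = [h.__class__.__name__ for h in opener.handlers]
print(handler_names)
```

If `MyHTTPSHandler` is listed and the stock `HTTPSHandler` is not, the handler registration itself is not the cause, which suggests the proxied HTTPS request takes a different path before reaching `send()`.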
However, if I remove the ProxyHandler, it works:
global RawRequest
urllib2.install_opener(urllib2.build_opener(MyHTTPHandler, MyHTTPSHandler))
urllib2.urlopen(urllib2.Request('http://www.google.com', None)) # global RawRequest updated
urllib2.urlopen(urllib2.Request('https://accounts.google.com', None)) # global RawRequest updated
How can I add the ProxyHandler into the mix while keeping access to the RawRequest?
Thank you in advance.