I'm working on a two-way HTTP proxy that forwards requests from the web browser to a randomly selected third-party proxy and then returns the response to the web browser.
Client -> Proxy -> ([Third-party Proxy 1] or [Third-party Proxy 2]) -> Proxy -> Client
Eventually I want to be able to inspect and change the headers, body, and status code of the response from the third-party proxy and then send the altered response back to the web browser.
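To make the goal concrete, here is roughly the kind of transformation I have in mind, as a Twisted-free sketch. It assumes the complete response is already in memory as raw bytes and that the body is neither chunked nor compressed. The function name `rewrite_response`, the `X-Altered-By` header, and the specific edits are hypothetical, chosen only for illustration.

```python
def rewrite_response(raw: bytes) -> bytes:
    """Return a copy of a complete raw HTTP/1.x response with edits applied."""
    # Split the head (status line + headers) from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.split(b"\r\n")
    version, _, _ = status_line.partition(b" ")

    # Illustrative edits only: force the status code and alter the body.
    status_line = version + b" 200 OK"
    body = body.replace(b"Example", b"Changed!")

    headers = []
    for line in header_lines:
        name, _, value = line.partition(b":")
        # Drop the old length; it is recomputed for the new body below.
        if name.strip().lower() != b"content-length":
            headers.append((name.strip(), value.strip()))
    headers.append((b"Content-Length", str(len(body)).encode("ascii")))
    headers.append((b"X-Altered-By", b"my-proxy"))  # hypothetical header

    lines = [status_line] + [n + b": " + v for n, v in headers]
    return b"\r\n".join(lines) + b"\r\n\r\n" + body
```

For example, passing in a `404 Not Found` response whose body is `Example page` would give back a `200 OK` response whose body is `Changed! page`, with `Content-Length` updated to match.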
This is what I have so far. It's adapted from an implementation by Sujit Pal. The author notes that it has some bugs but gives little detail about them or about how to fix them. It works for very small webpages such as www.example.com, but for anything larger it gets partway through downloading the page's resources and then stops. It appears to have something to do with one or both of the connections closing prematurely, but I can't see how that is happening. This is my first experience with Twisted and I don't know how to debug this. What am I doing wrong?
import random

from twisted.internet import reactor, protocol
from twisted.web import http


class HttpClientProtocol(http.HTTPChannel):
    """Connection from this proxy to the third-party proxy."""

    def __init__(self, serverTransport):
        # Transport of the browser-facing connection, used to relay the reply.
        self.serverTransport = serverTransport

    def sendMessage(self, data):
        self.transport.write(data)

    def dataReceived(self, data):
        self.data = data
        self.transport.loseConnection()

    def connectionLost(self, reason):
        self.serverTransport.write(self.data)
        self.serverTransport.loseConnection()


class HttpServerProtocol(http.HTTPChannel):
    """Connection from the web browser to this proxy."""

    def dataReceived(self, data):
        # Pick one of the third-party proxies at random.
        third_party_proxy_host, third_party_proxy_port = random.choice([
            ('127.0.0.1', 8080),
            ('127.0.0.1', 8081),
        ])
        client = protocol.ClientCreator(reactor, HttpClientProtocol, self.transport)
        d = client.connectTCP(third_party_proxy_host, third_party_proxy_port)
        d.addCallback(self.forwardToClient, data)

    def forwardToClient(self, client, data):
        client.sendMessage(data)


class MyHTTPFactory(http.HTTPFactory):
    protocol = HttpServerProtocol


reactor.listenTCP(8000, MyHTTPFactory())
reactor.run()
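One thing I have not been able to rule out is that a larger response arrives in more than one dataReceived call. The sketch below is Twisted-free and purely illustrative: `KeepsLastChunk`, `BuffersAll`, and `deliver` are hypothetical names, and the `closed` flag stands in for `transport.loseConnection()`. It mimics what my client protocol does with each chunk and compares that with buffering everything.

```python
class KeepsLastChunk:
    """Mimics HttpClientProtocol: store one chunk, then close."""

    def __init__(self):
        self.data = b""
        self.closed = False

    def dataReceived(self, data):
        if self.closed:
            return  # connection already closed, later chunks are lost
        self.data = data
        self.closed = True  # stands in for transport.loseConnection()


class BuffersAll:
    """Alternative for comparison: accumulate every chunk."""

    def __init__(self):
        self.chunks = []

    def dataReceived(self, data):
        self.chunks.append(data)

    @property
    def data(self):
        return b"".join(self.chunks)


def deliver(receiver, payload, chunk_size):
    """Feed payload to receiver in chunk_size pieces, as a stream might."""
    for i in range(0, len(payload), chunk_size):
        receiver.dataReceived(payload[i:i + chunk_size])
    return receiver.data


response = b"HTTP/1.1 200 OK\r\n\r\n" + b"x" * 100
truncated = deliver(KeepsLastChunk(), response, 32)  # first 32 bytes only
complete = deliver(BuffersAll(), response, 32)       # the whole response
```

If this is what is happening, it would match the symptom: a response that fits in a single chunk gets through intact, while a larger one is cut off after the first chunk.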