
Short version: Is there any easy API for encoding an HTTP request (and decoding the response) without actually transmitting and receiving the encoded bytes as part of the process?

Long version: I'm writing some embedded software which uses paramiko to open an SSH session with a server. I then need to make an HTTP request across an SSH channel opened with transport.open_channel('direct-tcpip', <remote address>, <source address>).

requests has transport adapters, which let you substitute your own transport. But the send interface provided by BaseAdapter just accepts a PreparedRequest object, which (a) doesn't provide the remote address in any useful way (you need to parse the URL to find the host and port) and (b) doesn't provide an encoded version of the request, only a dictionary of headers and the encoded body (if any). It also gives you no help in decoding the response. HTTPAdapter defers the whole lot to urllib3: encoding the request, making the network connection, sending the bytes, receiving the response bytes and decoding the response.

urllib3 likewise defers to http.client, and http.client's HTTPConnection class has encoding and network operations all jumbled up together.

Is there a simple way to say, "Give me a bunch of bytes to send to an HTTP server," and "Here's a bunch of bytes from an HTTP server; turn them into a useful Python object"?

Tom

2 Answers

3

This is the simplest implementation of this that I can come up with:

from http.client import HTTPConnection
import requests
from requests.structures import CaseInsensitiveDict
from urllib.parse import urlparse
from argparse import ArgumentParser

class TunneledHTTPConnection(HTTPConnection):
    def __init__(self, transport, *args, **kwargs):
        self.ssh_transport = transport
        HTTPConnection.__init__(self, *args, **kwargs)

    def connect(self):
        self.sock = self.ssh_transport.open_channel(
            'direct-tcpip', (self.host, self.port), ('localhost', 0)
        )

class TunneledHTTPAdapter(requests.adapters.BaseAdapter):
    def __init__(self, transport):
        super().__init__()
        self.transport = transport

    def close(self):
        pass

    def send(self, request, **kwargs):
        # Let urlparse split out the host and port (this also copes with
        # IPv6 literals and userinfo); default to port 80 if none is given.
        parsed = urlparse(request.url)
        host = parsed.hostname
        port = parsed.port or 80

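        connection = TunneledHTTPConnection(self.transport, host, port)
        # Send the origin-form target (path plus query), as HTTPAdapter does
        # for non-proxied requests, rather than the absolute URL.
        connection.request(method=request.method,
                           url=request.path_url,
                           body=request.body,
                           headers=request.headers)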
        r = connection.getresponse()
        resp = requests.Response()
        resp.status_code = r.status
        resp.headers = CaseInsensitiveDict(r.headers)
        resp.raw = r
        resp.reason = r.reason
        resp.url = request.url
        resp.request = request
        resp.connection = connection
        resp.encoding = requests.utils.get_encoding_from_headers(resp.headers)
        requests.cookies.extract_cookies_to_jar(resp.cookies, request, r)
        return resp

if __name__ == '__main__':
    import paramiko

    parser = ArgumentParser()
    parser.add_argument('-p', type=int, default=22,
                        help='Port the SSH server listens on')
    parser.add_argument('host', help='SSH server to tunnel through')
    parser.add_argument('username', help='Username on SSH server')
    parser.add_argument('url', help='URL to perform HTTP GET on')
    args = parser.parse_args()

    client = paramiko.SSHClient()
    client.load_system_host_keys()
    client.connect(args.host, args.p, username=args.username)

    transport = client.get_transport()

    s = requests.Session()
    # Route requests for this URL through the SSH-tunneled adapter.
    s.mount(args.url, TunneledHTTPAdapter(transport))
    response = s.get(args.url)
    print(response.text)

This adapter ignores the optional arguments to BaseAdapter.send (stream, timeout, verify, cert and proxies) and makes no attempt at connection pooling, but it gets the job done.
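
The same trick of overriding parts of http.client can also address the original question more directly. What follows is a minimal sketch under stated assumptions, not an established API: CapturingConnection, FakeSocket, encode_request and decode_response are hypothetical names, and the host and response bytes are made up for illustration. It overrides HTTPConnection.send to collect the encoded request instead of writing it to a socket, and feeds canned bytes to HTTPResponse through an object that only implements makefile.

```python
from http.client import HTTPConnection, HTTPResponse
from io import BytesIO


class CapturingConnection(HTTPConnection):
    """Hypothetical helper: collects the encoded request instead of sending it."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.captured = bytearray()

    def send(self, data):
        # http.client passes the encoded header block (and any bytes body)
        # to send(); store it rather than touching a socket.
        self.captured += data


class FakeSocket:
    """Hypothetical helper: just enough of a socket for HTTPResponse."""

    def __init__(self, data):
        self._file = BytesIO(data)

    def makefile(self, *args, **kwargs):
        return self._file


def encode_request(host, method, path, headers=None, body=None):
    # No connection is opened: send() above never reaches the network.
    conn = CapturingConnection(host)
    conn.request(method, path, body=body, headers=headers or {})
    return bytes(conn.captured)


def decode_response(raw):
    response = HTTPResponse(FakeSocket(raw))
    response.begin()  # parse the status line and headers
    return response


request_bytes = encode_request("example.com", "GET", "/index.html")
reply = decode_response(
    b"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 5\r\n\r\nhello"
)
print(request_bytes)
print(reply.status, reply.getheader("Content-Type"), reply.read())
```

The request bytes could then be written to the paramiko channel and the bytes read back from it handed to decode_response. Reading a complete response off the channel is left out of this sketch.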

itsadok
Tom
  • Where is the TunneledHTTPAdapter used? I don't see it referenced anywhere in the code. – Tom Apr 11 '20 at 17:06
  • 1
    @Tom TBH I don't know. I seem to have lost the code where I implemented this. I need to dig out an old laptop some time and if I come across it, I'll post an update. – Tom Apr 15 '20 at 09:15
1

You could write your own SOCKS4 proxy, run it on localhost, then point your HTTP requests at it. For example, https://urllib3.readthedocs.io/en/latest/advanced-usage.html describes how to use a SOCKS proxy with urllib3.

SOCKS4 is basically a simple handshake followed by raw HTTP/TCP traffic. The handshake conveys the target IP address and port. So your proxy can do the handshake to satisfy the client that it is a SOCKS server, then the proxy can send the "real" traffic straight to the SSH session (and proxy the responses in the reverse direction).
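
As a rough illustration of how small that handshake is, here is a sketch of parsing a SOCKS4 CONNECT request and building the reply. The function names are made up for this example, it handles only plain SOCKS4 (not SOCKS4a host names), and the listening socket and the relaying to the SSH channel are left out.

```python
import socket
import struct

SOCKS4_VERSION = 4
CMD_CONNECT = 1
REPLY_GRANTED = 90
REPLY_REJECTED = 91


def parse_socks4_request(data):
    """Return (host, port, user_id) from a SOCKS4 CONNECT request.

    Layout: version (1 byte), command (1 byte), port (2 bytes, big-endian),
    IPv4 address (4 bytes), then a NUL-terminated user ID.
    """
    if len(data) < 9:
        raise ValueError("SOCKS4 request too short")
    version, command, port = struct.unpack("!BBH", data[:4])
    if version != SOCKS4_VERSION:
        raise ValueError("not a SOCKS4 request")
    if command != CMD_CONNECT:
        raise ValueError("only CONNECT is handled in this sketch")
    host = socket.inet_ntoa(data[4:8])
    end = data.index(b"\x00", 8)  # raises ValueError if the NUL is missing
    return host, port, data[8:end]


def build_socks4_reply(granted=True):
    """Build the 8-byte reply; the port and address fields are left as zero."""
    code = REPLY_GRANTED if granted else REPLY_REJECTED
    return struct.pack("!BBH4s", 0, code, 0, b"\x00" * 4)


# A request as a client would send it for 192.0.2.10:8080 with user ID "tom".
example = struct.pack("!BBH", 4, 1, 8080) + socket.inet_aton("192.0.2.10") + b"tom\x00"
host, port, user_id = parse_socks4_request(example)
print(host, port, user_id)
```

After sending the granted reply, the proxy would open a direct-tcpip channel to (host, port) and copy bytes in both directions.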

The cool thing about this approach is that it will work with tons of clients--SOCKS has been widespread for a long time.

John Zwinck
  • It's an interesting idea, but waaaaay more complicated than what I'm after. I've figured out a method that works (which I will add as an answer shortly) but it still seems much too complicated. – Tom Sep 27 '17 at 12:43