requests doesn't expose the status line and headers in raw form. You should never need them in raw form: an RFC-compliant response can be reconstructed trivially from the data you do have. requests uses the urllib3 library as its basis, and that library, in turn, uses the Python standard library http.client module. That module doesn't give you the raw data either.
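As a rough illustration of that reconstruction, here is a minimal sketch. The helper name rebuild_head is made up for this example, and it assumes the HTTP version arrives as the integer 10 or 11 that http.client reports. The result is equivalent to what the server sent, not necessarily byte-identical (header casing, ordering and repeated headers may differ).

```python
def rebuild_head(version, status, reason, headers):
    """Rebuild a status line and header block from already-parsed values.

    rebuild_head is a hypothetical helper, not part of requests.
    version is assumed to be 10 or 11, as reported by http.client.
    """
    http_version = {10: "HTTP/1.0", 11: "HTTP/1.1"}.get(version, "HTTP/1.1")
    lines = [f"{http_version} {status} {reason}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # The head ends with an empty line, as in a real HTTP response.
    return "\r\n".join(lines) + "\r\n\r\n"
```

With a requests response r, this would be called as rebuild_head(r.raw.version, r.status_code, r.reason, r.headers), assuming r.raw is the usual urllib3 response object.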
Instead, the status line and headers are parsed directly into their constituent parts, in http.client.HTTPResponse._read_status() and http.client.parse_headers() (the latter delegating to the email.parser.Parser().parsestr() method to parse the headers into an http.client.HTTPMessage instance). Only the results of these parse operations are kept.
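To see what that parsing step leaves you with, here is a small sketch that feeds a made-up header block to http.client.parse_headers() through an in-memory file. The header values are invented for the example.

```python
import io
import http.client

# A hand-written header block followed by the start of a body.
raw = b"Content-Type: text/html; charset=utf-8\r\nX-Demo: 1\r\n\r\n<body>"
fp = io.BytesIO(raw)

# parse_headers() reads lines up to and including the blank line.
msg = http.client.parse_headers(fp)

content_type = msg["Content-Type"]   # parsed value; the raw bytes are gone
all_headers = msg.items()            # list of (name, value) pairs
remaining = fp.read()                # whatever follows the header block
```

The returned object is an http.client.HTTPMessage, so only the parsed names and values are available afterwards, which matches the point made above.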
You could try to wrap the urllib3 connection objects (reached via the get_connection() hook implemented on a requests transport adapter). Connection objects have a .connect() method with supporting methods that create socket objects, and if you were to wrap those in a file-like object and then record what each .readline() call returns, you could capture and store the raw data there.
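The recording idea can be sketched without touching requests or urllib3 at all. The example below is only an illustration under stated assumptions: RecordingReader, FakeSocket and capture_head are hypothetical names, and an in-memory stand-in replaces the real socket so that http.client.HTTPResponse can be driven directly.

```python
import io
from http.client import HTTPResponse


class RecordingReader:
    """File-like wrapper that stores every line returned by readline()."""

    def __init__(self, fp):
        self._fp = fp
        self.raw_lines = []

    def readline(self, *args):
        line = self._fp.readline(*args)
        self.raw_lines.append(line)
        return line

    def __getattr__(self, name):
        # Delegate everything else (read, close, ...) to the wrapped file.
        return getattr(self._fp, name)


class FakeSocket:
    """Stand-in for a socket; HTTPResponse only calls makefile() on it."""

    def __init__(self, payload):
        self.reader = RecordingReader(io.BytesIO(payload))

    def makefile(self, *args, **kwargs):
        return self.reader


def capture_head(payload):
    """Parse payload with HTTPResponse and return (raw head bytes, response)."""
    sock = FakeSocket(payload)
    response = HTTPResponse(sock)
    response.begin()  # reads the status line and headers via readline()
    raw_head = b"".join(sock.reader.raw_lines)
    return raw_head, response
```

Wiring such a wrapper into a live requests session is a separate and more invasive job; this only shows that the status line and headers pass through readline() and can be captured there.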
However, if you are debugging a broken HTTP server, I wouldn't bother trying to bend requests and its stack to your will here. Just use curl --include --raw <url> on the command line instead (perhaps with --verbose added).
Another option would be to use the http.client library directly: make the connection, send your outgoing headers with HTTPConnection.request(), then skip getresponse() and read directly from conn.sock.
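A minimal sketch of that last approach follows. The function name fetch_raw is made up for this example, and it assumes a plain-HTTP server that honours Connection: close and closes the socket when it is done; otherwise the loop ends with a socket timeout.

```python
import http.client


def fetch_raw(host, port=80, path="/", timeout=10):
    """Send a GET request and return the raw (head, body) bytes.

    fetch_raw is a hypothetical helper. It assumes the server closes
    the connection after responding.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # Ask the server to close the connection so recv() eventually
        # returns b"" and the loop below terminates.
        conn.request("GET", path, headers={"Connection": "close"})
        chunks = []
        while True:
            data = conn.sock.recv(4096)  # bypass getresponse() entirely
            if not data:
                break
            chunks.append(data)
    finally:
        conn.close()
    raw = b"".join(chunks)
    # Split the unparsed status line and headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    return head, body
```

The returned head is exactly what the server sent, status line included, so a malformed response shows up as-is instead of raising inside the parser.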