How to extract JSON data from a response containing a header and body?

Question

This is my first question posted to Stack Overflow. Typically I can find the solutions to my problems here, but for this particular situation I cannot. I am writing a Python plugin for my compiler that outputs REST calls in various languages for interaction with an API. I am authenticating with the socket and ssl modules by sending a username and password in the request body in JSON form. Upon successful authentication, the API returns a response in the following format, with the important response data in the body:

HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Date: Tue, 05 Feb 2013 03:36:18 GMT
Vary: Accept-Charset, Accept-Encoding, Accept-Language, Accept
Accept-Ranges: bytes
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: POST,OPTIONS,GET
Access-Control-Allow-Headers: Content-Type
Server: Restlet-Framework/2.0m5
Content-Type: text/plain;charset=ISO-8859-1
Content-Length: 94

{"authentication-token":"<token>","authentication-secret":"<secret>"}

This is probably a very elementary question for Pythonistas, given Python's powerful tools for string manipulation. But alas, I am a new programmer who started with Java. What would be the best way to parse this entire response to obtain the "<token>" and "<secret>"? Should I search for a "{" and dump the substring into a json object? My intuition is telling me to try the re module, but I cannot figure out how it would be used in this situation, since the values of the token and secret are obviously not predictable. Because I have opted to authenticate with a low-level module set, this response is one big string. I construct the header, append the JSON data to it as the body, then execute the request and obtain the response with the following code:

import socket
import ssl

# Socket configuration and connection execution
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
conn = ssl.wrap_socket(sock, ca_certs = pem_file)
conn.connect((host, port))
conn.send(req)

response = conn.recv()
print(response)

The print statement outputs the first code sample. Any help or insight would be greatly appreciated!
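
A caveat about the snippet above: a single recv() call is not guaranteed to return the whole response, so the printed string may be truncated. Below is a minimal sketch of a read loop, assuming the server sends a Content-Length header and does not use chunked transfer encoding. The names read_http_response and bufsize are hypothetical, and conn stands for any object with a recv() method, such as the wrapped socket above.

```python
def read_http_response(conn, bufsize=4096):
    """Read until the headers and Content-Length bytes of body have arrived.

    Hypothetical helper: it does not handle chunked transfer encoding.
    """
    data = b""
    # Keep reading until the blank line that ends the headers shows up.
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(bufsize)
        if not chunk:  # peer closed the connection early
            return data
        data += chunk

    head, _, body = data.partition(b"\r\n\r\n")

    # Look for Content-Length (header names are case-insensitive).
    length = None
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())

    # Read the rest of the body, or read until close if no length was given.
    while length is None or len(body) < length:
        chunk = conn.recv(bufsize)
        if not chunk:
            break
        body += chunk

    return head + b"\r\n\r\n" + body
```

With this in place, response = read_http_response(conn) would replace the single conn.recv() call, and the result is bytes that still need decoding before any string handling.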

This might help you (http://stackoverflow.com/questions/6386308/http-requests-and-json-parsing-in-python) — OnesimusUnbound, Feb 05 '13 at 03:57
I was running into many problems trying to authenticate with SSL using the requests library (using a Trust Store). As a result, I cannot just dump the response into a json object directly because it will raise a ValueError since the response contains both the header and the body. — Andrew Harasta, Feb 05 '13 at 04:05

score 3 · Accepted Answer · answered Feb 05 '13 at 04:07

HTTP headers are separated from the body by a \r\n\r\n sequence (a blank line). Do something like:

import json

...

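# Split only on the first blank line; assumes response is a str
# (under Python 3, decode the bytes returned by recv() first).
headers, js = response.split("\r\n\r\n", 1)
data = json.loads(js)  # parse the JSON body into a dict
token = data["authentication-token"]
secret = data["authentication-secret"]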

You'll probably want to check the response (status code, etc.) first, and various libraries (e.g. requests) can do all of this much more easily for you.
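
For what the extra checks might look like, here is a slightly more defensive sketch of the same split. It assumes the response arrives as bytes (as it does from recv() under Python 3) and that the charset is named in the Content-Type header, as in the question's sample. The function name extract_json_body and the default_charset parameter are made up for illustration.

```python
import json


def extract_json_body(raw, default_charset="iso-8859-1"):
    """Split raw HTTP response bytes into headers and body, then parse the body as JSON."""
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no blank line between headers and body")

    lines = head.decode("iso-8859-1").split("\r\n")

    # The status line looks like "HTTP/1.1 200 OK"; the second field is the code.
    status = int(lines[0].split(" ", 2)[1])
    if status != 200:
        raise ValueError("unexpected HTTP status: %d" % status)

    # Pick the charset out of Content-Type, if the header names one.
    charset = default_charset
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-type" and "charset=" in value.lower():
            charset = value.lower().split("charset=", 1)[1].split(";")[0].strip().strip('"')

    return json.loads(body.decode(charset))
```

The token and secret then come out the same way as above, e.g. data = extract_json_body(response) followed by data["authentication-token"].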

answered Feb 05 '13 at 04:07 by hrunting

Thank you! Knowing that the header and body are split in a standard way is very useful. This is precisely the elegant solution I was looking for. Unfortunately, either I was unable to figure out how to properly use requests for this authentication, or it's just not very well supported yet. I need to use a trust store and submit the username and password in json format in the body of the request. Could not find a way to do that with the library's current implementation. Anyways, thanks again! – Andrew Harasta Feb 05 '13 at 04:16
It is a bad idea to try to reimplement an HTTP parser. It fails if the server uses "\n" line endings, and it fails if the response is not pure 7-bit (given the Content-Type header in the question). How do you know when to stop reading the response without a Content-Length header if the server doesn't close the connection? It is equivalent to parsing HTML using a regex: it is fragile. A real parser such as [http-parser](http://pypi.python.org/pypi/http-parser) should be used instead. – jfs Feb 05 '13 at 04:29
@AndrewHarasta: if you don't know how to use `requests` with a trust store, you could [ask a question specifically about it](http://stackoverflow.com/questions/ask) (you could probably export the necessary certificates from the trust store to a pem file, given that one is already present in your code) – jfs Feb 05 '13 at 04:33
@J.F.Sebastian I suppose you are correct in that I should ask the question, since all other questions pertaining to the requests library are slightly different from mine, and it seems to be everyone's favorite tool for http interactions. This solution did work, but the potential problems you have raised here seem like valid concerns. I will look into using the parser, and try re-implementing requests. Thanks. – Andrew Harasta Feb 05 '13 at 14:35
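
On the point about using a real parser: one option that needs no third-party package is to hand the raw bytes to the parser in Python 3's standard library. This is only a sketch under some assumptions. The _BytesSocket class and parse_raw_response function are hypothetical names, and the approach relies on HTTPResponse calling makefile() on whatever it is given and on its begin() method, neither of which is part of the documented interface.

```python
import io
from http.client import HTTPResponse


class _BytesSocket:
    """Minimal stand-in for a socket: HTTPResponse only calls makefile() on it."""

    def __init__(self, raw):
        self._file = io.BytesIO(raw)

    def makefile(self, *args, **kwargs):
        return self._file


def parse_raw_response(raw):
    """Parse raw HTTP response bytes with the standard library's parser."""
    response = HTTPResponse(_BytesSocket(raw))
    response.begin()        # reads the status line and the headers
    body = response.read()  # reads the body, honouring Content-Length
    return response.status, response.headers, body
```

The returned headers object can report the charset via headers.get_content_charset(), so the body can be decoded and passed to json.loads() without any manual splitting. For a live connection, http.client.HTTPSConnection combined with ssl.create_default_context(cafile=pem_file) may remove the need to handle raw bytes at all.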
