How to parse an HTTP response using C++

Question

I have a C++ program that uses a socket to make an HTTP POST request. My question is how to parse the response that the server sends and put it in a vector of strings.

Here is my code for receive:

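    char buffer[1028] = {0};  // zero-filled so the received data stays null-terminated
    recv(socket, buffer, sizeof(buffer) - 1, 0);  // leave room for the terminator
    printf("%s", buffer);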

Then here is the response:

    HTTP/1.1 200 OK
    Data: Mon, 19 May 2014 12:46:36 GMT
    Server: Apache/2.4.9 (Win32) OpenSSL/0.9.8y PHP/5.4.27
    X-Powered-By: PHP/5.4.27
    Coneten-Length: 28
    Content-Type: text/html

    All done! Do some stuff now.

What I want to get from this response are three values: the status (200 OK), the content length (28), and the actual response message "All done! Do some stuff now."

And here is my expected result:

    response[0] = 200 OK
    response[1] = 28
    response[2] = All done! Do some stuff now.

Go hunt down a good c++ http class or library. Doing this yourself seems a bit crazy in this day and age... unless this is homework? — Rook, May 19 '14 at 11:35
Just like any other string? This is way too broad. Your question is essentially "How do I parse a string". Also "Coneten-Length" seems dubious. — luk32, May 19 '14 at 11:35
Here let me give you a Jeopardy answer: What is [libcurl](http://curl.haxx.se/libcurl/)? — Mgetz, May 19 '14 at 11:38
Firstly, you can't be sure to get all the data after one call to `recv`, even if there's ultimately less than 1028 bytes - you should check the return value from `recv` and loop reading further into the buffer until you hit some end-of-message delimiter or end-of-file (disconnection). Secondly, to populate your `vector` you'll need to find and extract specific text: there are hundreds of ways to do that - for a "C++" approach you could try `std::istringstream iss(buffer);` then use `getline` and `>>` - you'll find lots of examples in any introductory tutorial on "iostreams". — Tony Delroy, May 19 '14 at 11:42
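
The approach in the last comment could look roughly like the sketch below. It is only an illustration under a few assumptions: the server closes the connection once the response is complete, only `Content-Length` is of interest, and the names `read_until_closed`, `parse_response` and `read_some` are hypothetical (`read_some` stands in for a call such as `recv`).

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Keeps calling read_some(buf, len) until it reports 0 (peer closed) or a
// negative value (error), appending whatever arrived each time.
// read_some is a hypothetical stand-in for recv(sock, buf, len, 0).
template <typename ReadSome>
std::string read_until_closed(ReadSome read_some) {
    std::string data;
    char chunk[1024];
    for (;;) {
        const long n = read_some(chunk, sizeof chunk);
        if (n <= 0) break;  // 0: connection closed, <0: error
        data.append(chunk, static_cast<std::size_t>(n));
    }
    return data;
}

// Returns {status, content length, body}; a field stays empty if missing.
std::vector<std::string> parse_response(const std::string& raw) {
    std::vector<std::string> out(3);
    std::istringstream iss(raw);
    std::string line;

    // Status line, e.g. "HTTP/1.1 200 OK": skip the version, keep the rest.
    if (std::getline(iss, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        std::istringstream status(line);
        std::string version;
        status >> version >> std::ws;
        std::getline(status, out[0]);
    }

    // Header lines, up to the first empty line.
    while (std::getline(iss, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) break;
        std::istringstream header(line);
        std::string name;
        std::getline(header, name, ':');
        // Header names are case-insensitive, so compare in lower case.
        std::transform(name.begin(), name.end(), name.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        if (name == "content-length") header >> out[1];
    }

    // Whatever is left in the stream is the body.
    std::ostringstream body;
    body << iss.rdbuf();
    out[2] = body.str();
    return out;
}
```

With a real socket, `read_some` could be a lambda that forwards to `recv(sock, buf, len, 0)`, where `sock` is whatever descriptor the program already holds. On a keep-alive connection the loop would have to stop after `Content-Length` body bytes instead of waiting for the peer to close.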

score 3 · Answer 1 · answered May 19 '14 at 11:40

Use regular expressions to retrieve the status code and content length. Try something like this:

    boost::regex regexStatus("^HTTP/\\d\\.\\d (\\d{3} .+)$");
    boost::regex regexContentLength("^Content-Length: (\\d+)$");

Take the first capture group of each match and convert the content length to an integer with `boost::lexical_cast`. Then split the response at the blank line that ends the headers (the `\r\n\r\n` sequence) and read as many bytes from the second part as the content length indicates.
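
As a rough sketch of those steps, the version below swaps in `std::regex` for `boost::regex` so that it needs only the standard library; that substitution, the function name `parse_with_regex`, and the slightly stricter patterns (explicit `\r\n` line ends, case-insensitive header name) are assumptions for illustration and not part of the answer above.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Returns {status, content length, body}, or an empty vector when the input
// does not contain a complete HTTP/1.x status line and header block.
std::vector<std::string> parse_with_regex(const std::string& raw) {
    // The head (status line plus headers) ends at the first blank line.
    const std::size_t split = raw.find("\r\n\r\n");
    if (split == std::string::npos) return {};
    const std::string head = raw.substr(0, split + 2);  // keep the last CRLF
    const std::string rest = raw.substr(split + 4);     // bytes after the blank line

    static const std::regex status_re(R"(^HTTP/\d\.\d (\d{3}[^\r\n]*)\r\n)");
    static const std::regex length_re(
        R"(\r\nContent-Length:[ \t]*(\d+)[ \t]*\r\n)", std::regex::icase);

    std::smatch status;
    if (!std::regex_search(head, status, status_re)) return {};

    std::string length_text;
    std::string body = rest;
    std::smatch length;
    if (std::regex_search(head, length, length_re)) {
        length_text = length[1].str();
        // Keep at most Content-Length bytes of what follows the headers.
        // std::stoul throws std::out_of_range for absurdly large values.
        body = rest.substr(0, std::stoul(length_text));
    }
    return {status[1].str(), length_text, body};
}
```

For the response in the question, the three elements would be the status text, the digits of the content length, and the body, in that order. Chunked transfer encoding, folded headers and repeated `Content-Length` fields are not handled here.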

answered May 19 '14 at 11:40

jasal

+1 that's a reasonable hack, though strictly speaking there are lots of expectations of HTTP parsing like ignoring leading and embedded whitespace, case insensitivity, potential version numbers like 2.13 or 12.3 etc. (see e.g. [here](http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html) for starters). – Tony Delroy May 19 '14 at 12:12
That's true. If you want a bullet-proof implementation either read the RFC thoroughly or use an existing library like libcurl. – jasal May 19 '14 at 12:17
I think libcurl does not offer such functionality: http://stackoverflow.com/questions/4580548/how-does-one-parse-http-headers-with-libcurl – Sogartar Sep 21 '16 at 10:15
