
I am practicing socket programming: I send an HTTP request and then receive SOCK_STREAM data from a server. This question is about the receiving side.

Approach 1: my current approach simply uses a while loop

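while ((received = recv(sockfd, response, BUFSIZ, 0)) > 0) {
        // process the `received` bytes now in response (they are not NUL-terminated)
}

if (received < 0) {
        perror("Error receiving data");  // perror() appends ": <reason>" and a newline itself
}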

Basically, keep calling recv() until the peer closes the connection (recv() returns 0) or an error occurs (recv() returns -1).

However, I rarely see people doing this; most people on SO suggest using a length indicator, e.g., here and here. That makes me wonder whether I missed anything with the above approach.

Approach 2: length indicator approach

Okay, regarding the generally preferred length indicator approach: how do I get the total length of the entire received message?

For example:

[image: the HTTP response header I retrieved]

This is an HTTP header I retrieved; it has a field called Content-Length. Is that the length of the entire message (size of header + body = 468)?

GabrielChu

1 Answer


Calling recv() in a loop until a disconnect or error is reported works correctly only if two conditions hold: the server closes the connection after sending the response (usually because the client did not request a keep-alive, or the server decided not to honor it), and the response (or the protocol itself) specifies that socket closure marks EOF for the message data.
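To make the first condition more likely to hold, the client can ask for the connection to be closed. Here is a minimal sketch of that idea, assuming an HTTP 1.1 GET request; the function name `build_close_request` and the host and path arguments are hypothetical, not taken from the question.

```c
#include <stdio.h>

/* Formats a minimal GET request that asks the server to close the
 * connection after the response. Returns the request length, or -1
 * if the output buffer is too small. (Hypothetical helper.) */
int build_close_request(char *out, size_t cap,
                        const char *host, const char *path)
{
    int n = snprintf(out, cap,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Connection: close\r\n"  /* opt out of keep-alive */
                     "\r\n",                  /* blank line ends the header */
                     path, host);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

Even with this header, the response may still use chunked encoding, so reading until close gives you the raw bytes of the message but does not decode the body for you.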

In HTTP 0.9, there is no response line or response headers at all. The requested file is sent as-is by itself, and EOF is indicated by socket closure.

In HTTP 1.0 and later, the Content-Length header (if present at all) is NOT the total size of the complete HTTP message, only the size of the message body. The message header is variable-length and is terminated by the byte sequence 0x0D 0x0A 0x0D 0x0A (CRLF CRLF).
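As a rough illustration, the header size can be measured by scanning the received bytes for that terminator. This is only a sketch; `http_header_length` is a hypothetical helper name, and it assumes the bytes received so far are held in one contiguous buffer.

```c
#include <stddef.h>
#include <string.h>

/* Returns the number of bytes occupied by the header, including the
 * terminating CRLF CRLF, or 0 if the terminator has not arrived yet
 * (meaning more data must be received first). */
size_t http_header_length(const char *buf, size_t len)
{
    static const char term[4] = { '\r', '\n', '\r', '\n' };

    for (size_t i = 0; i + sizeof term <= len; i++) {
        if (memcmp(buf + i, term, sizeof term) == 0)
            return i + sizeof term;
    }
    return 0;
}
```

When a Content-Length header is present, the total message size is this header length plus the Content-Length value, not the Content-Length value alone.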

The CORRECT way to read an HTTP response is to do the following:

If the request is using HTTP 0.9:

  • read from the socket until the connection is closed or an error occurs.

  • then close the socket.

This is covered by the original W3C specification of HTTP.

Otherwise, if the response is using HTTP 1.0 or later:

  • read from the socket until you encounter the 0x0D 0x0A sequence of bytes denoting the end of the response line, which contains the HTTP version, Status Code, and Reason text.

  • then read from the socket until you encounter the 0x0D 0x0A 0x0D 0x0A sequence of bytes denoting the end of the response headers.

  • then analyze the response line and headers to know if a message body is present, and if so in what format it is being sent as, which dictates how you must read it.

  • read the message body, if present:

    • if the response Status code is 1xx, 204, or 304, or if the response is to a HEAD request, no message body is present.

    • otherwise, if a Transfer-Encoding header is present and has a value other than identity, read the message body in chunks until a 0-length chunk is read.

    • otherwise, if a Content-Length header is present, read from the socket until the exact number of bytes specified have been read, no more, no less.

    • otherwise, if the Content-Type header indicates a multipart/... media type, read from the socket and parse the MIME data until the final terminating MIME boundary is reached.

    • otherwise, read from the socket until the connection is closed.

  • if the response is not read in full successfully, or if a keep-alive is NOT being used (a Connection: close header is present in an HTTP 1.1 response, or a Connection: keep-alive header is not present in an HTTP 1.0 response), then close the socket.

This is covered by RFC 2616 (Section 4.4 and Section 8), and by RFC 7230 (Section 3.3.3 and Section 6).
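The decision order in the steps above can be sketched roughly as follows. This is an illustration, not a complete client: `choose_body_mode` and `recv_exact` are hypothetical names, the header values are assumed to be already extracted and trimmed (NULL when the header is absent), and chunk and multipart parsing are left out.

```c
#include <errno.h>
#include <stddef.h>
#include <strings.h>     /* strcasecmp, strncasecmp (POSIX) */
#include <sys/types.h>
#include <sys/socket.h>

enum body_mode {
    BODY_NONE, BODY_CHUNKED, BODY_LENGTH, BODY_MULTIPART, BODY_UNTIL_CLOSE
};

/* Applies the checks in the same order as the list above. */
enum body_mode choose_body_mode(int status, int is_head,
                                const char *transfer_encoding,
                                const char *content_length,
                                const char *content_type)
{
    if (is_head || (status >= 100 && status < 200) ||
        status == 204 || status == 304)
        return BODY_NONE;
    if (transfer_encoding && strcasecmp(transfer_encoding, "identity") != 0)
        return BODY_CHUNKED;
    if (content_length)
        return BODY_LENGTH;
    if (content_type && strncasecmp(content_type, "multipart/", 10) == 0)
        return BODY_MULTIPART;
    return BODY_UNTIL_CLOSE;
}

/* For BODY_LENGTH: keeps calling recv() until `want` bytes have arrived.
 * Returns the number of bytes read (a short count means the peer closed
 * early), or -1 on error. */
ssize_t recv_exact(int sockfd, void *buf, size_t want)
{
    char *p = buf;
    size_t got = 0;

    while (got < want) {
        ssize_t n = recv(sockfd, p + got, want - got, 0);
        if (n == 0)
            break;                 /* peer closed the connection */
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal, retry */
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

Note that any body bytes already received together with the header must be counted against Content-Length before calling something like `recv_exact` for the remainder.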

Remy Lebeau