
I'm seeing strange behaviour with the recv() function.

My C++ (MFC) application uses WinSock to implement a simple HTTP client (non-blocking socket) that fetches HTML pages from a web server. Some of these pages take a few seconds to load. On Windows 7 this is not a problem, because recv() also returns partial data. On Windows XP, however, recv() always returns SOCKET_ERROR with the error code WSAEWOULDBLOCK. Only when the transfer has finished is all the data returned in a single call.

Has anyone seen this problem? How can I force Windows XP to deliver partial data as well?

I set the receive buffer size (SO_RCVBUF) to 1000 bytes. On Windows 7 this is also reflected in the TCP window size; on XP it is not.

The real problem this causes is that I don't know how to check whether the connection is still alive. How can I check that? Or how can I specify a timeout (the maximum time between two packets received from the server)?

Benjamin J.

3 Answers

1

By default, a socket operates in blocking mode, so the only way you can get a WSAEWOULDBLOCK error at all is if you explicitly put the socket into non-blocking mode instead. Doing so, you agree to handle WSAEWOULDBLOCK (otherwise, don't use non-blocking mode).

WSAEWOULDBLOCK is not a real error, it is just an indication that the operation you attempted to perform cannot be completed at that moment because it would block the calling thread. You need to detect this "error" and simply retry the same operation again at a later time, preferably after a socket state change is detected.

For recv(), WSAEWOULDBLOCK simply means there is no data available on the socket to be read at that moment. In non-blocking mode, you should be using select() (or WSAEventSelect(), or WSAAsyncSelect(), or Overlapped I/O, or an I/O Completion Port) to detect inbound data before you then read it.

That being said, you are implementing an HTTP client, so you must follow the HTTP protocol properly, regardless of the socket I/O mode you are using, regardless of your socket buffer sizes. You must follow the pseudo code logic I outlined in this answer on another question:

You must follow the rules outlined in RFC 2616. Namely:

  1. Read until the "\r\n\r\n" sequence is encountered. Do not read any more bytes past that yet.

  2. Analyze the received headers, per the rules in RFC 2616 Section 4.4. They tell you the actual format of the remaining response data.

  3. Read the data per the format discovered in #2.

  4. Check the received headers for the presence of a Connection: close header if the response is using HTTP 1.1, or the lack of a Connection: keep-alive header if the response is using HTTP 0.9 or 1.0. If detected, close your end of the socket connection because the server is closing its end. Otherwise, keep the connection open and re-use it for subsequent requests (unless you are done using the connection, in which case do close it).

  5. Process the received data as needed.

In short, you need to do something more like this instead (pseudo code):

string headers[];
byte data[];

string statusLine = read a CRLF-delimited line;
int statusCode = extract from status line;
string responseVersion = extract from status line;

do
{
    string header = read a CRLF-delimited line;
    if (header == "") break;
    add header to headers list;
}
while (true);

if ( !((statusCode in [1xx, 204, 304]) || (request was "HEAD")) )
{
    if (headers["Transfer-Encoding"] ends with "chunked")
    {
        do
        {
            string chunk = read a CRLF delimited line;
            int chunkSize = extract from chunk line;
            if (chunkSize == 0) break;

            read exactly chunkSize number of bytes into data storage;

            read and discard until a CRLF has been read;
        }
        while (true);

        do
        {
            string header = read a CRLF-delimited line;
            if (header == "") break;
            add header to headers list;
        }
        while (true);
    }
    else if (headers["Content-Length"] is present)
    {
        read exactly Content-Length number of bytes into data storage;
    }
    else if (headers["Content-Type"] starts with "multipart/byteranges")
    {
        string boundary = extract from Content-Type header;
        read into data storage until terminating boundary has been read;
    }
    else
    {
        read bytes into data storage until disconnected;
    }
}

if (!disconnected)
{
    if (responseVersion == "HTTP/1.1")
    {
        if (headers["Connection"] == "close")
            close connection;
    }
    else
    {
        if (headers["Connection"] != "keep-alive")
            close connection;
    }
}

check statusCode for errors;
process data contents, per info in headers list;
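
The chunked branch of the pseudo code can be sketched in C++ as follows. This is only an illustration under simplifying assumptions: the function name `decodeChunked` is hypothetical, the whole chunked body is assumed to be in memory already (a real client would decode incrementally as data arrives), and trailer headers after the final chunk are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: decodes a complete chunked body held in memory.
// Returns false if the input is truncated or malformed.
bool decodeChunked(const std::string& raw, std::string& body)
{
    body.clear();
    std::size_t pos = 0;
    for (;;) {
        assert(pos <= raw.size());

        // Each chunk starts with a CRLF-delimited line holding the size in hex.
        std::size_t eol = raw.find("\r\n", pos);
        if (eol == std::string::npos) return false;
        std::string sizeLine = raw.substr(pos, eol - pos);

        // strtoul stops at the first non-hex character, so chunk
        // extensions such as ";name=value" are skipped.
        char* end = nullptr;
        unsigned long chunkSize = std::strtoul(sizeLine.c_str(), &end, 16);
        if (end == sizeLine.c_str()) return false;   // no hex digits at all
        pos = eol + 2;

        if (chunkSize == 0) return true;             // last chunk (trailers ignored)

        // Read exactly chunkSize bytes of data.
        if (chunkSize > raw.size() - pos) return false;
        body.append(raw, pos, chunkSize);
        pos += chunkSize;

        // Every chunk is followed by a CRLF that is not part of the data.
        if (raw.compare(pos, 2, "\r\n") != 0) return false;
        pos += 2;
    }
}
```

For example, the input "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n" decodes to "Wikipedia".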

As you can see, HTTP requires reading CRLF-delimited lines of text, or fixed lengths of raw bytes. To do that, you must call recv() in a loop until you encounter the terminating CRLF, or have received the expected number of bytes, whichever the case may be. Whether you use a synchronous loop that just ignores WSAEWOULDBLOCK errors while looping, or you use a state machine driven by asynchronous events/callbacks, that is up to you to decide. That doesn't change how you must process the HTTP protocol.
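
One way to structure that loop is to append whatever each recv() call delivers to a buffer and only hand out complete units. The sketch below assumes a hypothetical class `ReceiveBuffer`; it contains no socket calls itself, so it works the same way whether the bytes come from a synchronous loop or from an event-driven state machine.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: accumulates the bytes delivered by recv() and
// returns protocol units only once they have arrived completely.
class ReceiveBuffer {
public:
    // Append the bytes from one recv() call (n is the value recv() returned).
    void append(const char* bytes, std::size_t n)
    {
        assert(bytes != nullptr || n == 0);
        if (n > 0) data_.append(bytes, n);
    }

    // Extracts one CRLF-delimited line (without the CRLF).
    // Returns false if no complete line has been buffered yet.
    bool readLine(std::string& line)
    {
        std::size_t pos = data_.find("\r\n");
        if (pos == std::string::npos) return false;
        line = data_.substr(0, pos);
        data_.erase(0, pos + 2);
        return true;
    }

    // Extracts exactly count bytes.
    // Returns false if fewer than count bytes have been buffered yet.
    bool readExact(std::size_t count, std::string& out)
    {
        if (data_.size() < count) return false;
        out = data_.substr(0, count);
        data_.erase(0, count);
        return true;
    }

private:
    std::string data_;
};
```

When readLine() or readExact() returns false, the caller waits for more data (or a timeout), calls recv() again, appends the result, and retries. Partial data is then never a problem, no matter how the network splits the response.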

This applies to all versions of Windows (even all platforms that use BSD-style socket APIs). What you are encountering is not a Windows bug at all. It is an underlying flaw in your understanding of how to use socket I/O correctly and effectively.

As for checking if the connection is alive, recv() will return 0 if the server closed the connection gracefully, or will report an error otherwise (usually WSAECONNABORTED or WSAECONNRESET, though there can be others). But an abnormal disconnect may take a long time to detect, so you should implement timeouts in your code instead. In synchronous mode, you can use setsockopt(SO_RCVTIMEO). In non-blocking mode, you can use select(). In asynchronous (overlapped) mode, you can use WaitForSingleObject() on whatever event/object you use to drive your state machine.
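
A minimal sketch of the select() approach might look like this. The function `readSome` and the enum `ReadResult` are hypothetical names, not part of WinSock. The code assumes WSAStartup() has already been called and the socket is connected; the `#ifdef` branch is only there so the same sketch also builds against POSIX sockets.

```cpp
#include <cassert>
#ifdef _WIN32
  #include <winsock2.h>
  typedef SOCKET socket_t;
  static bool wouldBlock() { return WSAGetLastError() == WSAEWOULDBLOCK; }
#else
  #include <sys/select.h>
  #include <sys/socket.h>
  #include <sys/types.h>
  #include <cerrno>
  typedef int socket_t;
  static bool wouldBlock() { return errno == EWOULDBLOCK || errno == EAGAIN; }
#endif

enum class ReadResult { Data, Closed, TimedOut, Failed };

// Waits up to timeoutMs for the socket to become readable, then reads
// whatever is available (possibly fewer than len bytes).
ReadResult readSome(socket_t s, char* buf, int len, int timeoutMs, int& received)
{
    assert(buf != nullptr && len > 0 && timeoutMs >= 0);
    received = 0;

    fd_set readSet;
    FD_ZERO(&readSet);
    FD_SET(s, &readSet);

    timeval tv;
    tv.tv_sec = timeoutMs / 1000;
    tv.tv_usec = (timeoutMs % 1000) * 1000;

    // The first argument is ignored by WinSock but required on POSIX.
    int ready = select(static_cast<int>(s) + 1, &readSet, nullptr, nullptr, &tv);
    if (ready == 0) return ReadResult::TimedOut;   // nothing arrived in time
    if (ready < 0) return ReadResult::Failed;

    int n = static_cast<int>(recv(s, buf, len, 0));
    if (n > 0) { received = n; return ReadResult::Data; }
    if (n == 0) return ReadResult::Closed;         // graceful close by the peer

    // A spurious wake-up on a non-blocking socket is treated like a timeout.
    return wouldBlock() ? ReadResult::TimedOut : ReadResult::Failed;
}
```

The caller decides what a timeout means: typically close the socket and report that the connection is no longer alive.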

Remy Lebeau
  • Let's say the web page needs 5 minutes to be completely transferred. After the connect call I can start my own timeout and check it whenever recv() returns WSAEWOULDBLOCK. On Windows 7 I get partial data via recv(), say new data every 10 seconds. On Windows XP all this data is buffered and handed to my app via recv() only at the end of the transfer (after 5 minutes). With a timeout of, e.g., 1 minute I have no trouble on Windows 7, but on Windows XP the 1-minute timer expires. – Benjamin J. Mar 30 '17 at 20:24
  • `recv()` does NOT behave the way you claim, on any Windows version (I've coded WinSock apps on most of them, including XP and 7). In non-blocking mode, `recv()` returns whatever data is currently available at that moment, otherwise it returns `WSAEWOULDBLOCK` (if not another error). It can only return `WSAEWOULDBLOCK` when there is NO DATA in the socket. You can use `select()` to wait for new data to arrive. A socket timeout is per-byte. It may take 5 minutes to transfer the complete response, but it won't take 5 minutes to send individual bytes. A 10-30 second timeout is not unreasonable. – Remy Lebeau Mar 30 '17 at 20:31
  • Any trouble you are running into is going to be due to bugs in your own socket code, not in WinSock itself. If you really want people to help you with this, please [edit] your question to show your actual reading code. You are doing something wrong in it. – Remy Lebeau Mar 30 '17 at 20:33
  • The real question is, why are you implementing HTTP manually at all? You should be using an existing HTTP library instead, like Microsoft's WinInet/WinHTTP APIs, or libcurl, or any number of other 3rd party libraries. Let them do all the hard work for you. Even if you fix the obvious flaws in your underlying TCP code, HTTP is not trivial to implement from scratch, it has a lot of rules and nuances to it. I have written HTTP clients from scratch, so I know what I'm talking about. – Remy Lebeau Mar 30 '17 at 20:38
0

You can't expect recv to give you any data on a non-blocking socket. If there's no data available it returns WOULDBLOCK. You just need to call recv again (normally after select notifies you some data is available). Whether you get data on the first (or any) call is going to depend on how fast the server is sending it.

When the socket is closed you'll get a different error from recv, like WSAECONNRESET or WSAENOTCONN. select will also notify you when the socket is closed.

efhard
  • I know that I can't expect recv() to return data. But how long does it take until recv() returns WSAECONNRESET or WSAENOTCONN? That alone doesn't help. I think I need to set a timeout, because my application should tell the user that the connection is no longer alive, and not only after a long time (e.g. > 1 minute). – Benjamin J. Mar 30 '17 at 19:56
  • There is no specific timeout. You can keep calling recv and keep waiting for data theoretically forever, though usually one side or the other will time out. When the server side closes the connection you'll get an error. It's up to you to decide when to give up waiting for data. Normally you wait for data using select, which does take a timeout. But often that will be a short timeout because you have other sockets to read or write or other tasks to handle. You need to decide when to stop calling recv (or select) and give up. – efhard Mar 30 '17 at 20:02
  • When using the socket in non-blocking mode, you can use `select()` to implement a timeout. When `recv()` returns `WSAEWOULDBLOCK`, call `select()` with a timeout to wait for new data to arrive. In blocking mode, this is not necessary, `recv()` will just wait for new data to arrive before exiting. You can use `setsockopt(SO_RCVTIMEO)` to set a timeout on that wait. Either way, if a timeout elapses, close the socket and move on – Remy Lebeau Mar 30 '17 at 20:08
  • Yes, that is the simplest way if you are waiting for only one socket with `select` and if you have nothing else to do on the thread (no other need for a different timeout on `select`). But if that's the case then you probably have no need for a non-blocking socket. Otherwise you would need to keep track of the last time you got data from `recv`, then the next time before you call `recv` or `select` check if your desired timeout has passed. – efhard Mar 30 '17 at 22:05
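
The bookkeeping described in the last comment (remember when data last arrived, check it before the next `recv` or `select`) can be sketched with a small class. `IdleTimer` is a hypothetical name, and the current time is passed in by the caller, which is an assumption made here so the logic does not depend on any particular event loop.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: tracks how long a connection has been silent.
class IdleTimer {
public:
    using Clock = std::chrono::steady_clock;

    IdleTimer(std::chrono::milliseconds limit, Clock::time_point now)
        : limit_(limit), last_(now)
    {
        assert(limit.count() > 0);
    }

    // Call whenever recv() returned at least one byte.
    void onData(Clock::time_point now) { last_ = now; }

    // True once no data has arrived for at least the configured limit.
    bool expired(Clock::time_point now) const { return now - last_ >= limit_; }

    // Time left before expiry, usable as the next select() timeout.
    std::chrono::milliseconds remaining(Clock::time_point now) const
    {
        auto elapsed =
            std::chrono::duration_cast<std::chrono::milliseconds>(now - last_);
        return elapsed >= limit_ ? std::chrono::milliseconds(0)
                                 : limit_ - elapsed;
    }

private:
    std::chrono::milliseconds limit_;
    Clock::time_point last_;
};
```

In a loop that serves several sockets, each connection would own one such timer: call onData() after every successful `recv`, pass the smallest remaining() value to `select`, and close any connection whose timer reports expired().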
0

It's very strange.

Today I changed my software to use blocking sockets. It still doesn't work on that Windows XP machine. Windows 7 is no problem.

So I thought: let's try another PC. On that PC (also Windows XP) it does work. Then I tried a third PC with Windows XP, and it works there as well.

I still don't know what the problem is, but I think something must be wrong with the first PC.

Benjamin J.