For example:

Client Side

...
socket.connect(server_address)
data = some_message_less_than_100_bytes
socket.sendall(data)
...

Server Side

...
socket.accept()
socket.recv(1024)
...

Is the server side guaranteed to receive the data in one recv()?

If not, how does the standard solution of using a header to specify the message length even work? The header itself could be split, and we would have to check whether the whole header has been received. Or is the header fixed-length, so that the receiver can always interpret the first few bytes in the same way no matter how many pieces the data arrives in?

Actually I'm trying to do something like this

Client

while():
    send()
    recv()

Server

recv()
while():
    send() # Acknowledge to client
    recv()

which is suggested by ravi in Linux socket: How to make send() wait for recv()

but then I realized the problem described above.

Does ravi's answer assume that both client and server will receive what the other sent in a single recv()?

Update

I would very much like to post the image, but I can't because of low reputation...

The following link describes the HTTP/2 frame format:

https://datatracker.ietf.org/doc/html/rfc7540#section-4

It does use a fixed-length header, so no matter how many pieces the header is split into, the receiver can interpret it the same way.
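
As a rough sketch of how a receiver might interpret that header once all 9 bytes have arrived, following the layout in section 4.1 of the RFC linked above (24-bit length, 8-bit type, 8-bit flags, 1 reserved bit, 31-bit stream identifier). The function name `parse_frame_header` and the example values are made up for illustration:

```python
FRAME_HEADER_LEN = 9  # fixed size of an HTTP/2 frame header


def parse_frame_header(header: bytes):
    """Split a 9-byte HTTP/2 frame header into its fields."""
    if len(header) != FRAME_HEADER_LEN:
        raise ValueError("need exactly 9 bytes")
    length = int.from_bytes(header[0:3], "big")  # 24-bit payload length
    frame_type = header[3]                       # 8-bit frame type
    flags = header[4]                            # 8-bit flags
    # The top bit of the last 4 bytes is reserved, so mask it off.
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF
    return length, frame_type, flags, stream_id


# Hypothetical header: 5-byte payload, type 0x0, flags 0x1, stream 3.
example = (5).to_bytes(3, "big") + bytes([0x0, 0x1]) + (3).to_bytes(4, "big")
fields = parse_frame_header(example)
print(fields)
```

The receiver would first collect exactly 9 bytes (however many recv() calls that takes), parse them like this, and then read `length` more bytes for the payload.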

So I guess some sort of 'fixed' length is the only solution? Even if the header size itself is variable, it probably has some promised bits that indicate how long the header will be. Am I right?

jiyolla
  • It all depends on the socket _type_, which you didn't specify. – Armali May 12 '21 at 22:10
  • Oh, I meant TCP. – jiyolla May 13 '21 at 00:37
  • @성진영 In your update, an HTTP/2 Frame is a FIXED-length header that describes a VARIABLE-length payload. You would read in the 9-byte header, then interpret its first 3 bytes to determine the payload's length, and then read in that specified number of bytes. This is just ONE example of a header+payload format. FIXED-length headers are certainly not a requirement of TCP (ie HTTP 1.1 and earlier did not use Frames, they used VARIABLE-length CRLF-delimited headers instead) – Remy Lebeau May 13 '21 at 00:59
  • Thank you for clarifying. What about the last statement in the Update section? Doesn't a variable-length header also use some fixed bits to indicate how long the header will be, just like the fixed header tells the receiver the payload length? – jiyolla May 13 '21 at 01:06
  • Problem solved. A delimiter can be used to avoid any kind of fixed bits entirely. So either a 'promised position' or a 'promised character'. – jiyolla May 13 '21 at 01:25

2 Answers

Is the server side guaranteed to receive the data in one recv()?

No. TCP is a byte stream, not a message protocol. While it will likely work in most cases with small messages and an empty send buffer, it will start to fail once the data sent gets larger than the MTU of the underlying data link. TCP does not guarantee an atomic send-recv pair for anything larger than a single octet, so don't count on it even for small data.
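
A minimal sketch of the byte-stream behavior, assuming `socket.socketpair()` as a stand-in for a real TCP client/server pair (it returns two connected stream sockets in one process). The tiny receive buffer is chosen only to make the effect visible:

```python
import socket

# Two connected stream sockets; an assumption standing in for TCP here.
a, b = socket.socketpair()

a.sendall(b"hello")
a.sendall(b"world")
a.close()  # closing lets the reader see end-of-stream

chunks = []
while True:
    chunk = b.recv(3)   # deliberately tiny buffer
    if not chunk:       # b"" means the peer closed the connection
        break
    chunks.append(chunk)
b.close()

received = b"".join(chunks)
print(chunks)    # chunk boundaries are unrelated to the two sendall() calls
print(received)  # the bytes themselves arrive complete and in order
```

The two sends do not come back as two messages. Only the order of the bytes is preserved, so any message boundary has to be encoded in the data itself.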

Steffen Ullrich
  • OK, you confirmed for me that it is wrong to assume that more than a single byte will be received by recv(). That's good. Actually, searching for HTTP implementation guide, I found that HTTP has 72 bits fixed frame header in which the first 24 bits represent the length of the payload that follows. So I guess what happens in an HTTP connection is exactly "So that the receiver can always interpret the first few bytes in the same way no matter in how many pieces that data is sent?" – jiyolla May 13 '21 at 00:41
  • "*I found that HTTP has 72 bits fixed frame header*" - HTTP/2 does. Earlier versions of HTTP do not. – Remy Lebeau May 13 '21 at 01:01
  • @JinyoungSung: Fixed-length messages, or fixed-length prefixes which contain the message length, are common ways to add message semantics on top of the byte stream TCP provides. HTTP/1 is different, though: there is a variable-length header with a fixed separator string (empty line). The size of the message body is usually contained inside this variable-length header (as `Content-Length`) but can also be conveyed differently (like with HTTP chunked transfer encoding). – Steffen Ullrich May 13 '21 at 04:54

Is the server side guaranteed to receive the data in one recv()?

For UDP, yes. recv() will return either one whole datagram or an error. However, if the buffer is smaller than the datagram, the datagram is truncated and the discarded bytes can't be recovered.
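
A small sketch of the datagram behavior, assuming a loopback interface is available. Loopback delivery is reliable in practice, but UDP itself guarantees neither delivery nor ordering, so the timeout is there as a safety net. The variable names are illustrative:

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
receiver.settimeout(2.0)          # avoid blocking forever if a datagram is lost
addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", addr)
sender.sendto(b"world", addr)

# Each recv() hands back one whole datagram, never a merged or partial one
# (as long as the buffer is large enough to hold it).
first = receiver.recv(1024)
second = receiver.recv(1024)
print(first, second)

sender.close()
receiver.close()
```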

For TCP, no. The only guarantee you have is that if no error occurs and the peer has not closed the connection, recv() will return at least 1 byte but no more than the specified buffer size. It can return any number of bytes in between.

If not, how does the standard solution of using a header to specify the message length even work? The header itself could be split, and we would have to check whether the whole header has been received. Or is the header fixed-length?

It can go either way, depending on the particular format of the header. Many protocols use fixed-length headers, and many protocols use variable-length headers.
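
For the variable-length case, one common approach is to read until a delimiter shows up. The following is only a sketch under assumptions: `recv_until` is a hypothetical helper, `socket.socketpair()` stands in for a TCP connection, and the `\r\n\r\n` terminator mimics HTTP/1-style headers:

```python
import socket


def recv_until(sock, delimiter, bufsize=4096):
    """Read from sock until delimiter has been seen.

    Returns (data_before_delimiter, leftover_bytes_after_it).
    """
    buf = bytearray()
    while True:
        idx = buf.find(delimiter)
        if idx != -1:
            return bytes(buf[:idx]), bytes(buf[idx + len(delimiter):])
        chunk = sock.recv(bufsize)
        if not chunk:
            raise ConnectionError("connection closed before delimiter arrived")
        buf += chunk


a, b = socket.socketpair()
a.sendall(b"Content-Length: 5\r\n")  # the delimiter is split across two sends
a.sendall(b"\r\nhello")
header, rest = recv_until(b, b"\r\n\r\n")
print(header)  # the header without the terminator
a.close()
b.close()
```

Because the search runs over the accumulated buffer rather than over each chunk, it does not matter where the stream happens to be cut. Any bytes read past the delimiter are returned as leftover so the caller can treat them as the start of the body.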

Either way, you may have to call send() multiple times to ensure you send all the relevant bytes, and call recv() multiple times to ensure you receive all of them. There is no 1:1 relationship between sends and receives in TCP.

Does ravi's answer assume that both client and server will receive what the other sent in a single recv()?

Ravi's answer makes no assumptions whatsoever about the number of bytes sent by send() and received by recv(). His answer is presented from a higher-level perspective. But it is trivial to force the required behavior, e.g.:

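#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Sends exactly len bytes. Returns 0 on success, -1 on error. */
int sendAll(int sckt, void *data, int len)
{
    char *pdata = (char*) data;
    while (len > 0) {
        int res = send(sckt, pdata, len, 0);
        if (res > 0) {
            /* partial send: skip past the bytes that were accepted */
            pdata += res;
            len -= res;
        }
        else if (errno != EINTR) {
            /* EINTR just retries; anything but "would block" is fatal */
            if ((errno != EWOULDBLOCK) && (errno != EAGAIN)) {
                return -1;
            }
            /*
            optional: use select() or (e)poll to
            wait for the socket to be writable ...
            */
        }
    }
    return 0;
}

/* Receives exactly len bytes. Returns 1 on success,
   0 if the peer closed the connection, -1 on error. */
int recvAll(int sckt, void *data, int len)
{
    char *pdata = (char*) data;
    while (len > 0) {
        int res = recv(sckt, pdata, len, 0);
        if (res > 0) {
            /* partial read: keep filling the rest of the buffer */
            pdata += res;
            len -= res;
        }
        else if (res == 0) {
            /* peer disconnected before len bytes arrived */
            return 0;
        }
        else if (errno != EINTR) {
            if ((errno != EWOULDBLOCK) && (errno != EAGAIN)) {
                return -1;
            }
            /*
            optional: use select() or (e)poll to
            wait for the socket to be readable ...
            */
        }
    }
    return 1;
}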

This way, you can use sendAll() to send the message header followed by the message data, and recvAll() to receive the message header followed by the message data.
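
Since the question's snippets look like Python, here is a rough equivalent there, as a sketch only. `recv_exact`, `send_msg` and `recv_msg` are hypothetical names, the 4-byte big-endian length header is just one possible format, and `socket.socketpair()` stands in for a TCP connection. Python's `sendall()` already loops internally, so only the receiving side needs an explicit loop:

```python
import socket
import struct


def recv_exact(sock, n):
    """Keep calling recv() until exactly n bytes have arrived."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed mid-message")
        buf += chunk
    return bytes(buf)


def send_msg(sock, payload):
    # Header: payload length as an unsigned 32-bit integer in network byte order.
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_msg(sock):
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


a, b = socket.socketpair()
send_msg(a, b"first")
send_msg(a, b"second message")
m1 = recv_msg(b)
m2 = recv_msg(b)
print(m1, m2)
a.close()
b.close()
```

Fixing the header at 4 bytes in network byte order keeps the format independent of the `int` size and endianness on either side.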

Remy Lebeau
  • So, in your recvAll() and sendAll(), I think you used 'int len' as the promised fixed length of header, am I right? – jiyolla May 13 '21 at 00:36
  • It is whatever size the caller needs it to be. It is not limited to just the header. For example, [Macattack's answer](https://stackoverflow.com/a/19795805/65863) to the same question that Ravi answered, could be adapted to use `sendAll()`/`recvAll()` like this: `int sendLen = strlen(sendBuff); sendAll(sockfd, &sendLen, sizeof(sendLen)); sendAll(sockfd, sendBuff, sendLen);` ... `int len = 0; recvAll(connfd, &len, sizeof(len)); recvBuff = malloc(len); recvAll(connfd, recvBuff, len); ... free(recvBuff);` – Remy Lebeau May 13 '21 at 00:56
  • And, I think even in variable-length header, there would be some 'promised bits' to indicate how long the header would be or where the payload would start. Isn't that so? – jiyolla May 13 '21 at 00:58
  • So in Macattack's answer, I think he actually used the first few bytes of 'the size of int' as the promised fixed length to indicate how long would the payload be? That's why he said there actually is an assumption being made, namely that the client and the server share the same int size? – jiyolla May 13 '21 at 01:02
  • @성진영 "*I think even in variable-length header, there would be some 'promised bits' to indicate how long the header would be or where the payload would start*" - there MIGHT be, or there MIGHT NOT be. That is entirely up to the design of the protocol. Many commonly used Internet protocols use variable-length headers without any fixed bits at all. Such as versions of HTTP prior to HTTP/2, which use a variable-length header consisting of CRLF-delimited lines terminated by 2 sequential CRLFs. The header contains lines that describe the presence and length of the message body. – Remy Lebeau May 13 '21 at 01:03
  • OH!!! Using a unique delimiter!!!Thank you so much!! – jiyolla May 13 '21 at 01:07
  • @성진영 "*in Macattack's answer, I think he actually used the first few bytes of 'the size of int' as the promised fixed length to indicate how long would the payload be?*" - no, he uses the VALUE of the `int` to specify the payload length. The `int` itself IS THE HEADER. He sets the `int`'s value to the string's length, then sends the `int` as the header, then sends the string's characters as the payload. The receiver reads the `int` as the header, then reads the specified number of characters as the payload. – Remy Lebeau May 13 '21 at 01:08
  • So to conclude, there only two way to send variable length of message in a single socket communication. 1. Use promised bits to indicate the total length. 2. Use a promised delimiter so the receiver knows whether it has received the whole message (not literally 'all' of the message, but the receiver at least has some idea of what to expect). – jiyolla May 13 '21 at 01:10
  • "in Macattack's answer, I think he actually used the first few bytes of 'the size of int' as the promised fixed length to indicate how long would the payload be?" by this, I actually meant the size of int is kind of used as fixed length for the header. So normally it would be 4 bytes. The receiver assumes the first 4 bytes as the header, which may not be true depending on the receiver's environment. – jiyolla May 13 '21 at 01:13
  • @성진영 "*So to conclude, there only two way to send variable length of message in a single socket communication*" - basically, yes. You can either 1) send the data's length before sending the data (DOES NOT require a FIXED-length header, but it helps). Read the length (however needed) and then read how much it says; or 2) send the data followed by a unique terminator. Read until the delimiter is received. – Remy Lebeau May 13 '21 at 01:13
  • OK, I was unclear. 'The size of int' is the promise of how long the header will be. 'The content of int' (whose interpretation depends on the size of int) is the length of the payload. – jiyolla May 13 '21 at 01:16
  • @성진영 "*So normally it would be 4 bytes. The receiver assumes the first 4 bytes as the header, which may not be true depending on the receiver's environment*" - a good protocol designer would be explicit about the format used regardless of environment. For instance, by using `uint32_t` in big-endian (network byte order). The sender can use `htonl()` before sending the integer, and the receiver can use `ntohl()` after receiving it. – Remy Lebeau May 13 '21 at 01:16
  • Everything clear now! Thank you very much. – jiyolla May 13 '21 at 01:16
  • can you help me with my question here: https://stackoverflow.com/questions/71837210/python-socket-works-fine-when-debugging-line-by-line-but-not-working-on-the-full – Arash Apr 12 '22 at 06:24