Proper way of breaking a TCP stream into messages (in C)

Question

I'm working on a C project that uses TCP to exchange data between a client and a server. Thanks to a comment on my previous question, and after looking at answers to earlier questions, I realized that the data from a single send may arrive across multiple recv calls, so I wrote the following code.

Sending

    void send_message(int clientSocket, char message [], int length ){
        int sent = 0;
        while(sent < length){
            sent += send(clientSocket, message + sent , length - sent , 0);
        }
    }

Receiving

    void receive_message(int clientSocket, char message [], int length){
        int received = 0;
        while(received < length){
            received += recv(clientSocket, message + received , length - received , 0);
        }
    }

If I'm just trying to send a single message, this works without problems. However, when I try to send multiple messages one after the other, the sending side completes its operations and exits the while loops (which should mean that all the bytes have been sent), but the receiving side doesn't: it gets stuck when one of the recv calls keeps reading 0 bytes from the socket. Here's my code.

Client

#include <stdio.h>
#include<stdlib.h>
#include<string.h>
#include<inttypes.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include<sys/socket.h>

void send_message(int clientSocket, char message [], int length ){
    int sent = 0;
    while(sent < length){
        sent += send(clientSocket, message + sent , length - sent , 0);
    }
}
void receive_message(int clientSocket, char message [], int length){
    int received = 0;
    while(received < length){
        received += recv(clientSocket, message + received , length - received , 0);
    }
}
int main(){
    //SETTING UP THE SOCKET
    struct sockaddr_in serverAddr;
    socklen_t addr_size;
    int clientSocket = socket(AF_INET, SOCK_STREAM, 0);
    serverAddr.sin_family = AF_INET;
    serverAddr.sin_port = htons(7899);
    //REPLACE  SERVER_IP_ADDRESS WITH YOUR ADDRESS
    inet_pton(AF_INET,"SERVER_IP_ADDRESS9999",&serverAddr.sin_addr);
    memset(serverAddr.sin_zero, '\0', sizeof serverAddr.sin_zero);
    addr_size = sizeof serverAddr;
    connect(clientSocket, (struct sockaddr *) &serverAddr, addr_size);

    // Testing
    char message [3000]= "";
    memset(message, 0,3000);
    for(int i = 0 ; i< 20; i ++){
        //Filling the string
        for(int j=0; j<2999; j++){
            message[j] = 'A' + i;
        }
        send_message(clientSocket, message, 3000);
        printf("i = %d \n %s\n",i, message);
    }
}

Server

#include <stdio.h>
#include<stdlib.h>
#include<string.h>
#include<inttypes.h>
#include <netinet/in.h>
#include<sys/socket.h>

void send_message(int clientSocket, char message [], int length ){
    int sent = 0;
    while(sent < length){
        sent += send(clientSocket, message + sent , length - sent , 0);
    }
}
void receive_message(int clientSocket, char message [], int length){
    int received = 0;
    while(received < length){
        received += recv(clientSocket, message + received , length - received , 0);
    }
}
int main(){
    //SETTING UP THE SOCKET
    int welcomeSocket, clientSocket;
    struct sockaddr_in serverAddr;
    struct sockaddr_storage serverStorage;
    socklen_t addr_size;
    welcomeSocket = socket(AF_INET, SOCK_STREAM, 0);
    serverAddr.sin_family = AF_INET;
    serverAddr.sin_port = htons(7899);
    serverAddr.sin_addr.s_addr = INADDR_ANY;
    memset(serverAddr.sin_zero, '\0', sizeof serverAddr.sin_zero);
    bind(welcomeSocket, (struct sockaddr *) &serverAddr, sizeof(serverAddr));
    if (listen(welcomeSocket,1)==0)
        printf("Listening\n");
    else
        printf("Error\n");
    addr_size = sizeof serverStorage;
    
    //Accept the connection on the socket for the incoming client
    clientSocket = accept(welcomeSocket, (struct sockaddr *) &serverStorage, &addr_size);

    printf("accepted");

    // Testing
    char message [3000]= "";
    memset(message, 0,3000);
    for(int i = 0 ; i< 20; i ++){
        receive_message(clientSocket, message, 3000);
        printf("i:= %d\n %s\n",i, message);
    }
}

Note: if I tell the sender to wait a few moments between each message, the program works without a problem. What could the reason be? Am I overloading the socket by sending too many bytes at the same time?

EDIT: Included code.

Note 2: From my tests, the code works if run locally, with client and server on the same machine. It breaks, however, when I try to run it against a remote server.

EDIT 2: To see what I mean about "waiting": if I include a line in the sender that prints the message, there are no problems. If I don't, the server has the problems that I described above and in the comments.

Edit 3: Solution

Thanks to Serge Ballesta's answer, I implemented this solution:

Server side

#include <stdio.h>
#include<stdlib.h>
#include<string.h>
#include<inttypes.h>
#include <netinet/in.h>
#include<sys/socket.h>

void receive_message(int clientSocket, char message [], int length){
    int received = 0; int r =0;
    while(received < length){
        r = recv(clientSocket, message + received , length - received , 0);
        if( r<= 0){
            printf("!\n\nERROR- skip \n\n\n!!!!!!!!");
            break;
        }else{
            received += r;
        }
    }
}
int main(){
    //SETTING UP THE SOCKET
    int welcomeSocket, clientSocket;
    struct sockaddr_in serverAddr;
    struct sockaddr_storage serverStorage;
    socklen_t addr_size;
    welcomeSocket = socket(AF_INET, SOCK_STREAM, 0);
    serverAddr.sin_family = AF_INET;
    serverAddr.sin_port = htons(7899);
    serverAddr.sin_addr.s_addr = INADDR_ANY;
    memset(serverAddr.sin_zero, '\0', sizeof serverAddr.sin_zero);
    bind(welcomeSocket, (struct sockaddr *) &serverAddr, sizeof(serverAddr));
    if (listen(welcomeSocket,1)==0)
        printf("Listening\n");
    else
        printf("Error\n");
    addr_size = sizeof serverStorage;
    
    //Accept the connection on the socket for the incoming client
    clientSocket = accept(welcomeSocket, (struct sockaddr *) &serverStorage, &addr_size);

    printf("accepted");

    // Testing
    char message [3000]= "";
    memset(message, 0,3000);
    for(int i = 0 ; i< 30; i ++){
        receive_message(clientSocket, message, 3000);
        printf("i:= %d\n %s\n",i, message);
    }
}

Client Side

#include <stdio.h>
#include<stdlib.h>
#include<string.h>
#include<inttypes.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include<sys/socket.h>

void send_message(int clientSocket, char message [], int length ){
    int sent = 0;
    while(sent < length){
        sent += send(clientSocket, message + sent , length - sent , 0);
    }
}

int main(){
    //SETTING UP THE SOCKET
    struct sockaddr_in serverAddr;
    socklen_t addr_size;
    int clientSocket = socket(AF_INET, SOCK_STREAM, 0);
    serverAddr.sin_family = AF_INET;
    serverAddr.sin_port = htons(7899);
    //REPLACE  SERVER_IP_ADDRESS WITH YOUR ADDRESS
    inet_pton(AF_INET,"15.161.236.166",&serverAddr.sin_addr);
    memset(serverAddr.sin_zero, '\0', sizeof serverAddr.sin_zero);
    addr_size = sizeof serverAddr;
    connect(clientSocket, (struct sockaddr *) &serverAddr, addr_size);

    // Testing
    printf("Start\n");
    char message [3000]= "";
    memset(message, 0,3000);
    for(int i = 0 ; i< 30; i ++){
        //Filling the string
        for(int j=0; j<2999; j++){
            message[j] = 'A' + (i % 26);//(rand() %20) ;
        }
        message[4] = 0;
        send_message(clientSocket, message, 3000);
    }
    shutdown(clientSocket, 2);
}

You have to take into account that `recv` can return -1 if it times out, which in your code results in decreasing the `received` counter, so you will probably never reach `length` received bytes. You have to handle these `-1` returns by checking `errno` and proceeding if it is equal to `ETIMEDOUT` or `EWOULDBLOCK`. — Roberto Caboni, Jun 13 '22 at 07:47
TCP has flow control built in. The sender gets permission from the receiver to send up to a certain transmit window size. Unless your TCP stack is broken, you cannot overflow the network. As you print the message on both ends, what do you observe? Are single bytes missing? Whole blocks of data? When you use Wireshark or tcpdump etc. to verify what is sent, what do you see? — Gerhardh, Jun 13 '22 at 07:47
@Gerhardh When I print from the sending end I see all 30 messages, and when printing on the receiving end I see that the first few strings are received correctly, while from a certain point onward it just stops printing. From other tests I saw two things: 1) It loops in the receiving `while`. 2) Most times (not always) it starts to loop because it misses the last byte of one of the messages. I have never used Wireshark or tcpdump, so I can't answer your last question. — R_marche, Jun 13 '22 at 07:58
The problem is that you only showed a small part of your real code, so we are not able to reproduce. You will have to provide a true [mre] if you want others to understand what actually happens. — Serge Ballesta, Jun 13 '22 at 08:02
@RobertoCaboni I'll check if that is the problem and I'll post an update to the question. If `recv` returns a -1 can I simply acknowledge it, and try to receive the remaining bytes? Or would I need to notify the sender and ask him to re-send the message? Anyway, thank you, I'll look into how to handle errors — R_marche, Jun 13 '22 at 08:04
@SergeBallesta I'll update with a reproducible example in a few minutes. — R_marche, Jun 13 '22 at 08:07
How does 'fill_message_string(message);' work? Does it fill 1023 bytes with non-NUL data, followed by one NUL? — Martin James, Jun 13 '22 at 08:08
Using a non-random pattern would make it much easier to see where something is missing. — Gerhardh, Jun 13 '22 at 08:20
Resending of missing packets is also done by TCP automatically. The application does not need to request any resending — Gerhardh, Jun 13 '22 at 08:20
There is no such thing as a message in TCP. If you want messages you have to implement them yourself, which means via an application protocol such that you know when you've received an entire message. This can be accomplished by various means, including: a single fixed message length for all messages; a length-word prefix in the message; an agreed message delimiter such as a newline; or a self-describing protocol such as XML. Mere coding will never solve this for you. — user207421, Jun 13 '22 at 08:21
@user207421 I know that TCP doesn't divide the stream in messages. One of the solution you mentioned is already implemented here, with all messages having the same fixed length — R_marche, Jun 13 '22 at 08:58
A return value of 0 from `recv` (when the requested length is greater than 0) indicates an end-of-file condition. There will be no more data to be received from the stream once the end-of-file condition has been reached. — Ian Abbott, Jun 13 '22 at 09:18
@IanAbbott Thanks for this explanation. Does this mean that once a single `recv` returns 0, all the following ones will return 0 too? Moreover, why would `recv` return 0 if the return value of `send` shows that all bytes have been correctly sent, and the receiving side still has to read all of them? — R_marche, Jun 13 '22 at 09:31
Yes it does mean that. End of stream is the end. It does not mean end of message, which doesn't exist, as I stated above. It would return 0 if the peer has closed the connection, and not before. — user207421, Jun 13 '22 at 09:51
@user207421 Any ideas on what may cause the receiver to get an end of file while not all sent bytes have actually been read? — R_marche, Jun 13 '22 at 10:07
@R_marche The situation you describe is impossible. *Ergo* it didn't happen, and your observations are at fault. — user207421, Jun 13 '22 at 10:11
@user207421 Ok, thanks for informing me about the impossibility. I'm just trying to understand _where_ my observations are faulty: as of right now (and as stated in the question) the sending functions all return the expected value, but on the receiving side this doesn't happen. Moreover, this behaviour is seemingly influenced by the timing between successive calls to the `send` function. I included the code, so my observations should be replicable. — R_marche, Jun 13 '22 at 10:27
@R_marche The reason is in my comment above. Suppose you have to read 200 bytes that arrive in two 100-byte packets, with an inter-packet delay of 50 ms. You read the first one (100 bytes received); the following reads don't find data and return -1 (errno = EWOULDBLOCK) --> (99 bytes received!!!!). Finally you receive the last 100 bytes, after which read returns 0. But the total received is 199, so you'll never exit the loop. I currently cannot write an answer; I hope this explanation is enough. — Roberto Caboni, Jun 13 '22 at 12:15
@RobertoCaboni It partially worked, so first of all, thank you. You were correct in saying that the problem arises when -1 is returned by the sender. However, just "skipping" those -1 wasn't enough to solve. — R_marche, Jun 13 '22 at 13:57
@R_marche, after each -1 you have to read the errno value, and proceed if it is equal to ETIMEDOUT (blocking socket) or EWOULDBLOCK (non-blocking socket). Any other error means that some failure occurred. — Roberto Caboni, Jun 13 '22 at 14:29
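
The comments above come down to checking every return value of `recv` before adding it to the running total. A minimal sketch of such a loop follows, assuming a blocking socket with no receive timeout; `recv_all` and its return convention are illustrative and not part of the original code. It retries only on `EINTR`. Whether `EAGAIN`/`EWOULDBLOCK` should also be retried depends on how the socket is configured, so this sketch simply reports them as errors.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not from the original post): read exactly `length`
 * bytes from a stream socket.
 * Returns  1 when all bytes were read,
 *          0 when the peer closed the connection before that,
 *         -1 on any other error (errno is left as set by recv). */
int recv_all(int fd, char *buf, size_t length) {
    size_t received = 0;
    while (received < length) {
        ssize_t r = recv(fd, buf + received, length - received, 0);
        if (r > 0) {
            received += (size_t)r;   /* only positive counts are added */
        } else if (r == 0) {
            return 0;                /* end of stream: nothing more will come */
        } else if (errno == EINTR) {
            continue;                /* interrupted by a signal: try again */
        } else {
            return -1;               /* real failure: let the caller decide */
        }
    }
    return 1;
}
```

A caller would typically stop its receive loop as soon as `recv_all` returns anything other than 1, instead of printing a buffer that was only partially filled.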

score 1 · Accepted Answer · answered Jun 13 '22 at 13:07

You have 2 problems in your code:

- The sender (the client program) exits as soon as the last send call returns. As you try to send 20*3000 bytes, you can expect (when you use a real network and not the loopback interface on a single machine) that a number of bytes have only been queued for transfer and have not yet been received at that moment. But the end of the client program will abruptly close the socket, and the queued bytes will not be sent at all.
- The receiver expects all 60000 bytes to be received and never tests for an early peer closure on the socket. If that happens (and because of the problem on the sender side, it is to be expected), the receiver will fall into an endless loop reading 0 bytes from a socket that has already been closed by the sender.

What to do:

- The receiver should test for a 0-byte read. If it happens, it means that nothing more will ever come from the socket, and the receiver should immediately abort with an error message. At least the problem will be easier to diagnose.
- The sender should not abruptly close its socket but instead use a graceful shutdown: when everything has been sent, it should call shutdown on the socket to notify the peer that nothing more will be sent, and then wait for the receiver to close the socket once everything has been correctly transmitted. It is enough to use a blocking read of a few bytes: the read will block until the peer closes its socket and will then return a 0-byte read.

Thank you for the answer! I'm now making the modifications needed to test your solution. I just want to ask for a clarification: does the `shutdown` call also need to be reciprocated by the receiver? Or should the receiver wait with a simple `recv`? — R_marche, Jun 13 '22 at 14:01
@R_marche: For a one-sided transfer, only the sender needs to use `shutdown`. References about graceful shutdown can be found on this [MSDN page](http://msdn.microsoft.com/en-us/library/windows/desktop/ms738547%28v=vs.85%29.aspx) — Serge Ballesta, Jun 13 '22 at 15:45
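
One of the comments on the question lists a length-word prefix as an alternative to fixed-size messages. For variable-length messages, that idea might be sketched as follows. All function names here are hypothetical, and the 4-byte big-endian header is an assumed wire format, not something the original code uses.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Write exactly len bytes; returns 0 on success, -1 on error. */
static int write_all(int fd, const void *data, size_t len) {
    const char *p = data;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n == -1) {
            if (errno == EINTR) continue;   /* interrupted: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes; returns 0 on success, -1 on error or early close. */
static int read_all(int fd, void *data, size_t len) {
    char *p = data;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n == 0) return -1;              /* peer closed mid-message */
        if (n == -1) {
            if (errno == EINTR) continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one message: 4-byte big-endian length, then the payload. */
int send_framed(int fd, const void *msg, uint32_t len) {
    uint32_t header = htonl(len);
    if (write_all(fd, &header, sizeof header) == -1) return -1;
    return write_all(fd, msg, len);
}

/* Receive one message into buf (capacity cap bytes).
 * Returns the payload length, or -1 on error, early close, or a message
 * larger than cap (the stream is then out of sync and should be closed). */
int64_t recv_framed(int fd, void *buf, uint32_t cap) {
    uint32_t header;
    if (read_all(fd, &header, sizeof header) == -1) return -1;
    uint32_t len = ntohl(header);
    if (len > cap) return -1;
    if (read_all(fd, buf, len) == -1) return -1;
    return (int64_t)len;
}
```

The receiver never has to guess where a message ends: it reads the header first, then exactly that many payload bytes, however TCP happened to split them.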