I have been reading about TCP packets and how they can be split up any number of times during their voyage. I took this to mean I would have to implement some kind of buffer on top of the buffer used for the actual network traffic, in order to store the result of each ReceiveAsync() until enough data is available to parse a message. BTW, I am sending length-prefixed, protobuf-serialized messages over TCP.

Then I read that the lower layers (Ethernet? IP?) will actually re-assemble packets transparently.

My question is, in C#, am I guaranteed to receive a full "message" over TCP? In other words, if I send 32 bytes, will I necessarily receive those 32 bytes in "one-go" (one call to ReceiveAsync())? Or do I have to "store" each receive until the number of bytes received is equal to the length-prefix?

Also, could I receive more than one message in a single call to ReceiveAsync()? Say one "protobuf message" is 32 bytes. I send 2 of them. Could I potentially receive 48 bytes in "one go" and then 16 in another?

I know this question shows up easily on Google, but I can never tell if it's in the correct context (talking about the actual TCP protocol, or how C# will expose network traffic to the programmer).

Thanks.

Matthew Goulart
  • Whilst TCP data can be split up and take different routes during their voyage, suffice to say this is 100% transparent to applications. All you need do is receive it safe in the knowledge that it is re-assembled for you. This is NOT the case with UDP –  Jan 31 '17 at 00:59
  • @MickyD While your statements are correct, I think you're missing the point of the question. He's asking about the amount of data available to the application at any given moment. – Jonathon Reinhart Jan 31 '17 at 01:00
  • @JonathonReinhart That's why this is a comment not an answer –  Jan 31 '17 at 01:08

3 Answers

TCP is a stream protocol - it transmits a stream of bytes. That's all. Absolutely no message framing / grouping is implied. In fact, you should forget that Ethernet packets or IP datagrams even exist when writing code using a TCP socket.

You may find yourself with 1 byte available, or 10,000 bytes available to read. The beauty of the (synchronous) Berkeley sockets API is that you, as an application programmer, don't need to worry about this. Since you're using a length-prefixed message format (good job!), simply recv() as many bytes as you're expecting. If there are more bytes available than the application requests, the kernel will keep the rest buffered until the next call. If there are fewer bytes available than required, the thread will either block or the call will indicate that fewer bytes were received. In the latter case, you can simply call recv() again for the remaining bytes; it will wait until more data is available.
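A minimal sketch of that "keep reading until you have what you expect" loop in C#. It assumes a 4-byte big-endian length prefix (the question does not say which prefix encoding is used), and the names `SyncFraming`, `ReadExactly` and `ReadMessage` are made up for illustration. It works on any `Stream`, so a `NetworkStream` wrapping the socket could be passed in.

```csharp
using System;
using System.IO;

public static class SyncFraming
{
    // Reads exactly `count` bytes into `buffer`, looping because a single
    // Stream.Read call may return fewer bytes than requested.
    public static void ReadExactly(Stream stream, byte[] buffer, int offset, int count)
    {
        while (count > 0)
        {
            int n = stream.Read(buffer, offset, count);
            if (n == 0)
                throw new EndOfStreamException("Connection closed in the middle of a message.");
            offset += n;
            count -= n;
        }
    }

    // Assumption: every message is preceded by a 4-byte big-endian length.
    public static byte[] ReadMessage(Stream stream)
    {
        var header = new byte[4];
        ReadExactly(stream, header, 0, header.Length);

        int length = (header[0] << 24) | (header[1] << 16) | (header[2] << 8) | header[3];
        if (length < 0)
            throw new InvalidDataException("Negative length prefix.");

        var payload = new byte[length];
        ReadExactly(stream, payload, 0, length);
        return payload;
    }
}
```

Each call to `ReadMessage` returns one complete payload, regardless of how the bytes were split or merged on the wire. A real server would probably also want an upper bound on `length` so a corrupt or hostile prefix cannot trigger a huge allocation.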

The problem with async APIs is that they require the application to track a lot more state itself. Even this Microsoft example of Asynchronous Client Sockets is far more complicated than it needs to be. With async APIs, you still control the amount of data you're requesting from the kernel, but when your async callback is fired, you then need to know the next amount of data to request.

Note that C# async/await in .NET 4.5 makes asynchronous processing easier, as you can write asynchronous code in a synchronous style. Have a look at this answer where the author comments:

Socket.ReceiveAsync is a strange one. It has nothing to do with async/await features in .net4.5. It was designed as an alternative socket API that wouldn't thrash memory as hard as BeginReceive/EndReceive, and only needs to be used in the most hardcore of server apps.
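With async/await, the receive loop can look almost the same as a synchronous one. The following is a sketch under assumptions, not the approach from the linked answer: it uses `Stream.ReadAsync` (for example on a `NetworkStream`) rather than `Socket.ReceiveAsync` with `SocketAsyncEventArgs`, it assumes a 4-byte big-endian length prefix, and `AsyncFraming`, `ReadExactlyAsync` and `ReadMessageAsync` are hypothetical names.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncFraming
{
    // Awaits reads until exactly `count` bytes have been stored in `buffer`.
    public static async Task ReadExactlyAsync(
        Stream stream, byte[] buffer, int count, CancellationToken ct)
    {
        int offset = 0;
        while (offset < count)
        {
            int n = await stream.ReadAsync(buffer, offset, count - offset, ct)
                                .ConfigureAwait(false);
            if (n == 0)
                throw new EndOfStreamException("Connection closed in the middle of a message.");
            offset += n;
        }
    }

    // Assumption: every message is preceded by a 4-byte big-endian length.
    public static async Task<byte[]> ReadMessageAsync(
        Stream stream, CancellationToken ct = default(CancellationToken))
    {
        var header = new byte[4];
        await ReadExactlyAsync(stream, header, header.Length, ct).ConfigureAwait(false);

        int length = (header[0] << 24) | (header[1] << 16) | (header[2] << 8) | header[3];
        if (length < 0)
            throw new InvalidDataException("Negative length prefix.");

        var payload = new byte[length];
        await ReadExactlyAsync(stream, payload, length, ct).ConfigureAwait(false);
        return payload;
    }
}
```

The "what do I request next" state that a callback-based design has to track by hand lives in the local variables of the async methods here, which is most of what makes this style easier to follow.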

Jonathon Reinhart
  • Thanks for the reply. So that means if I am using the new(er) `SocketAsyncEventArgs`, I will have to implement my own message framing (like the Berkeley sockets API you mentioned)? – Matthew Goulart Jan 31 '17 at 01:19
  • More or less, yes. If you don't need a seriously massive number of parallel connections, I would first write your code to use the synchronous APIs, creating a new thread to handle each connection if necessary. Then consider using the latest async support in .NET to handle more connections in fewer threads. – Jonathon Reinhart Jan 31 '17 at 01:28
  • When you say "seriously massive", what are we talking about? I am making a (very basic) MMO game server, so think 10k+ connections, all sending 2 or 3 position updates a second plus whatever other info. Before you tell me it's impossible, it's just for educational purposes, not for actual production :) – Matthew Goulart Jan 31 '17 at 01:33

TCP is a stream-based octet protocol. So, from the application's perspective, you can only read or write bytes to the stream.

I have been reading about TCP packets and how they can be split up any number of times during their voyage.

TCP packets (segments) are a network implementation detail. They're used for efficiency (it would be very inefficient to send one byte at a time). Segmentation and fragmentation are handled by the operating system's network stack and the network hardware, and are never exposed to applications. An application never knows what a "packet" is or where its boundaries are.

I took this to mean I would have to implement some kind of buffer on top of the buffer used for the actual network traffic, in order to store the result of each ReceiveAsync() until enough data is available to parse a message.

Yes. Because "message" is not a TCP concept. It's purely an application concept. Most application protocols do define a kind of "message" because it's easier to reason about.

Some application protocols, however, do not define the concept of a "message"; they treat the TCP stream as an actual stream, not a sequence of messages.

In order to support both kinds of application protocols, TCP/IP APIs have to be stream-based.

BTW, I am sending length-prefixed, protobuf-serialized messages over TCP.

That's good. Length prefixing is much easier to deal with than the alternatives, IMO.

My question is, in C#, am I guaranteed to receive a full "message" over TCP?

No.

Or do I have to "store" each receive until the number of bytes received is equal to the length-prefix? Also, could I receive more than one message in a single call to ReceiveAsync()?

Yes, and yes.

Even more fun:

  • You can get only part of your length prefix (assuming a multi-byte length prefix).
  • You can get any number of messages at once.
  • Your buffer can contain part of a message, or part of a message's length prefix.
  • The next read may not finish the current message, or even the current message's length prefix.
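One way to cope with all of the cases in this list is to keep a buffer of unconsumed bytes and extract complete messages from it after every receive. The class below is a rough sketch of that idea, not the code from the FAQ: `LengthPrefixFramer` and `Append` are hypothetical names, the 4-byte big-endian prefix is an assumption, and `List<byte>` is chosen for brevity rather than performance.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: accumulates raw received bytes and returns complete messages.
public sealed class LengthPrefixFramer
{
    private const int HeaderSize = 4; // assumed 4-byte big-endian length prefix
    private readonly int _maxMessageSize;
    private readonly List<byte> _pending = new List<byte>();

    public LengthPrefixFramer(int maxMessageSize = 1 << 20)
    {
        _maxMessageSize = maxMessageSize;
    }

    // Feed whatever one receive call produced; get back zero or more messages.
    public List<byte[]> Append(byte[] data, int offset, int count)
    {
        for (int i = 0; i < count; i++)
            _pending.Add(data[offset + i]);

        var messages = new List<byte[]>();
        int pos = 0;

        // Stop as soon as even the length prefix is incomplete.
        while (_pending.Count - pos >= HeaderSize)
        {
            int length = (_pending[pos] << 24) | (_pending[pos + 1] << 16)
                       | (_pending[pos + 2] << 8) | _pending[pos + 3];
            if (length < 0 || length > _maxMessageSize)
                throw new InvalidOperationException("Invalid length prefix: " + length);

            if (_pending.Count - pos - HeaderSize < length)
                break; // body not complete yet; wait for the next receive

            messages.Add(_pending.GetRange(pos + HeaderSize, length).ToArray());
            pos += HeaderSize + length;
        }

        // Drop the consumed bytes, keeping any partial prefix or partial body.
        _pending.RemoveRange(0, pos);
        return messages;
    }
}
```

For instance, if two 32-byte messages (36 bytes each with the prefix) arrive as a 48-byte chunk followed by a 24-byte chunk, the first `Append` returns one message and keeps the 12 leftover bytes, and the second `Append` returns the other message.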

For more information on the details, see my TCP/IP .NET FAQ, particularly the sections on message framing and some example code for length-prefixed messages.

I strongly recommend using only asynchronous APIs in production; the synchronous alternative of having two threads per connection negatively impacts scalability.

Oh, and I also always recommend using SignalR if possible. Raw TCP/IP socket programming is always complex.

Stephen Cleary

My question is, in C#, am I guaranteed to receive a full "message" over TCP?

No. You are not guaranteed to receive a full message in one call. A single send does not necessarily result in a single receive. You must keep reading on the receiving side until you have received everything you need.

See the example here; it keeps the data read so far in a buffer and keeps checking whether there is more data to be read:

private static void ReceiveCallback(IAsyncResult ar)
{
    try
    {
        // Retrieve the state object and the client socket 
        // from the asynchronous state object.
        StateObject state = (StateObject)ar.AsyncState;
        Socket client = state.workSocket;
        // Read data from the remote device.
        int bytesRead = client.EndReceive(ar);
        if (bytesRead > 0)
        {
            // There might be more data, so store the data received so far.
            state.sb.Append(Encoding.ASCII.GetString(state.buffer, 0, bytesRead));
            //  Get the rest of the data.
            client.BeginReceive(state.buffer, 0, StateObject.BufferSize, 0,
                new AsyncCallback(ReceiveCallback), state);
        }
        else
        {
            // All the data has arrived; put it in response.
            if (state.sb.Length > 1)
            {
                response = state.sb.ToString();
            }
            // Signal that all bytes have been received.
            receiveDone.Set();
        }
    }
    catch (Exception e)
    {
        Console.WriteLine(e.ToString());
    }
}

See this MSDN article and this article for more details. The 2nd link goes into more details and it also has sample code.

CodingYoshi
  • You *may* not receive a full message, and a single send does not *necessarily* correspond to a single receive. – user207421 May 29 '21 at 07:49