As I understand it, TcpStream doesn't know when a complete message has arrived from a client; the data simply arrives as a stream of bytes.

However, when I run the standard "Hello World" TcpStream example in Rust, I read complete HTTP messages off the stream. When I send two or more messages, they are separated correctly.

How is this possible?

use std::io::prelude::*;
use std::net::TcpListener;
use std::net::TcpStream;

fn main() {
    let listener = TcpListener::bind("127.0.0.1:8080").unwrap();

    for stream in listener.incoming() {
        let stream = stream.unwrap();
        handle_connection(stream);
    }
}

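fn handle_connection(mut stream: TcpStream) {
    let mut buffer = [0; 512];
    // Read once; `read` returns how many bytes were placed in the buffer.
    let bytes_read = stream.read(&mut buffer).unwrap();
    // Print only the bytes actually received, not the zero padding.
    println!("{}", String::from_utf8_lossy(&buffer[..bytes_read]));
}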

When I reduce the buffer size, the HTTP messages get cut off, and newer messages still start from the beginning. I assume this means I have to detect where one HTTP message ends and the next one starts myself?

pretzelhammer
ohboy21
  • Because you got lucky. – Shepmaster Sep 28 '20 at 14:04
  • Could you elaborate on that? – ohboy21 Sep 28 '20 at 14:06
  • Does this answer your question? [TCP messages arrival on the same socket](https://stackoverflow.com/questions/31056725/tcp-messages-arrival-on-the-same-socket), [TCP stream vs UDP message](https://stackoverflow.com/questions/17446491/tcp-stream-vs-udp-message) – John Kugelman Sep 28 '20 at 14:30
  • This is a common point of confusion with TCP in general; it's not Rust-specific. There are a lot more questions like these if you search for "TCP message boundaries" or other similar language-agnostic queries. – John Kugelman Sep 28 '20 at 14:32

2 Answers

At a very low level, the client is using the basic Unix write(socket, buf, nbytes) operation to put the bytes into the socket, and the server is using nread = read(socket, buf, maxbytes) to pull bytes out of the socket.

When read() and write() are used with a socket, they don't guarantee any sort of behavior regarding splitting up a write() into multiple read()s or coalescing multiple write()s into one big read(). Anything could happen, so long as the bytes that the client writes will eventually be read by the server in the same order the client wrote them (assuming the network connection doesn't go down for some reason).
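To make the lack of boundaries concrete, here is a minimal sketch in Rust. `ChunkedReader` and `collect_reads` are hypothetical names made up for this illustration; the reader imitates a socket that hands out data in arbitrary pieces. Whatever the chunk size, the bytes come out in the same order, but the number of reads needed to get them differs.

```rust
use std::io::{self, Read};

/// Hypothetical reader that hands out at most `chunk` bytes per `read` call,
/// imitating a socket that delivers data in arbitrary pieces.
struct ChunkedReader<'a> {
    data: &'a [u8],
    chunk: usize,
}

impl<'a> Read for ChunkedReader<'a> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Deliver no more than the chunk size, the caller's buffer size,
        // or the amount of data left, whichever is smallest.
        let n = self.chunk.min(buf.len()).min(self.data.len());
        buf[..n].copy_from_slice(&self.data[..n]);
        self.data = &self.data[n..];
        Ok(n)
    }
}

/// Calls `read` until it returns 0 (end of stream). Returns the collected
/// bytes and the number of `read` calls that delivered data.
fn collect_reads<R: Read>(reader: &mut R) -> io::Result<(Vec<u8>, usize)> {
    let mut out = Vec::new();
    let mut buf = [0u8; 512];
    let mut calls = 0;
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            return Ok((out, calls));
        }
        out.extend_from_slice(&buf[..n]);
        calls += 1;
    }
}
```

Running `collect_reads` over the same request with `chunk: 5` and with `chunk: 1000` yields identical bytes; only the number of reads changes, which is all that TCP promises as well.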

In your particular case, the client probably issued two write() calls that got translated in the TCP layer into two corresponding packets. The server process was waiting on a read() call. The server OS managed to wake up the server process and give it the content of the first packet before the second packet arrived or otherwise made it through the server OS. So the server process found itself in the convenient position of having exactly one complete HTTP request to handle.

Shepmaster
NovaDenizen

The behavior you observed follows from how readers work in general.

The read method on TcpStream is provided by the Read trait. Quoting the documentation for this method:

Pull some bytes from this source into the specified buffer, returning how many bytes were read.

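Note the "some bytes" part: read returns whatever data is currently available, up to the size of the buffer, and by default blocks only while nothing is available yet. It does not wait for the buffer to fill. For TcpStream, a read can therefore return less than a full buffer in two cases: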

  • the request was fully sent, the other side simply doesn't provide any more data;
  • or there's some network lag, and only part of the request is already here.
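Neither case can be told apart from the return value alone, as the following sketch illustrates. `ReadOutcome` and `read_once` are hypothetical helpers, not part of the standard library. The only thing a single read tells you is whether the stream has ended (`Ok(0)` with a non-empty buffer) or how many bytes arrived.

```rust
use std::io::{self, Read};

/// Hypothetical classification of what a single `read` call can tell us.
#[derive(Debug, PartialEq)]
enum ReadOutcome {
    /// `read` returned 0: end of stream (for a socket, the peer closed it).
    Closed,
    /// `read` returned n > 0 bytes; this may or may not be a whole message.
    Data(Vec<u8>),
}

/// Performs exactly one `read` into a non-empty `buf` and reports the result.
fn read_once<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<ReadOutcome> {
    match reader.read(buf)? {
        0 => Ok(ReadOutcome::Closed),
        n => Ok(ReadOutcome::Data(buf[..n].to_vec())),
    }
}
```

A `Data` result shorter than the buffer does not prove that the message is complete; only the content of the bytes can answer that.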

In your tests, it seems you were always hitting the first case: the request had already been fully transferred when you called read, and it fit into the buffer, so it could be read to the end. However, when the buffer is too small, you will not get the whole message at once, and so you must call read again on the same stream to get the rest of it.

In real code, you should parse the request as you read it to determine which case you've hit: did the request arrive fully, or do you have to read more?
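As a rough sketch of what "parse as you read" could look like for the header part of an HTTP request: keep reading until the blank line (`\r\n\r\n`) that ends the headers has been seen. `read_http_head` is a hypothetical helper written for this answer. It assumes a request without a body and sets no upper limit on the header size, both of which real code would have to handle (for example via Content-Length).

```rust
use std::io::{self, Read};

/// The blank line that terminates an HTTP header block.
const HEADER_END: &[u8] = b"\r\n\r\n";

/// Reads until the end of the HTTP headers has been received.
/// Returns the head (including the terminator) and any bytes that were read
/// past it, which belong to the body or to the next message.
fn read_http_head<R: Read>(reader: &mut R) -> io::Result<(Vec<u8>, Vec<u8>)> {
    let mut received = Vec::new();
    let mut buf = [0u8; 16]; // deliberately small to force several reads
    loop {
        // Check whether the terminator is already in what we have so far.
        // (Rescanning from the start is simple but not the fastest option.)
        if let Some(pos) = received
            .windows(HEADER_END.len())
            .position(|w| w == HEADER_END)
        {
            let rest = received.split_off(pos + HEADER_END.len());
            return Ok((received, rest));
        }
        let n = reader.read(&mut buf)?;
        if n == 0 {
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "connection closed before end of headers",
            ));
        }
        received.extend_from_slice(&buf[..n]);
    }
}
```

Since `TcpStream` implements `Read`, the same function could be called as `read_http_head(&mut stream)`. The leftover bytes must be kept and prepended to whatever is read next, because a single read may return the tail of one message together with the start of the following one.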

Shepmaster
Cerberus