
Below is a simple TCP echo server I've just written. It uses the .recv call to read a client's data.

Everything works fine, and I can see the messages I sent, but they can be any size: 1 byte, 10 bytes, 60 bytes, 1024 bytes. I thought the .recv call blocked the application until the buffer was filled with data. But it doesn't!

I don't get how the .recv call decides that there is nothing more to read from the socket and that it "may return this data to the caller now, since there is nothing left to read and we shouldn't wait for more bytes to fill the buffer".

I have even read the man pages for the recv and read system calls.

I'd appreciate any help or useful links to read about it.

I'd like to clarify, the questions are:

  • How does the .recv call know that there are no more bytes to read from the socket?
  • How can I force the .recv call to wait until the buffer is full?
  • If the .recv call waits for some amount of time (like a timeout) before returning the data, how can I change that timeout?
import socket

if __name__ == '__main__':
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(('127.0.0.1', 43542))
    s.listen()

    while True:
        try:
            client, addr = s.accept()
        except Exception:
            s.close()
            break

        client.settimeout(12345)
        result = client.recv(1024)
        print('message:', result.decode('utf-8'))
        s.close()  # closes the listening socket after the first message (see UPD 2)

UPD.

Client code:

import socket
import time

if __name__ == '__main__':
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(('127.0.0.1', 43542))
    message = "hello!"
    for i in range(6):
        s.send(message[i].encode('utf-8'))
        time.sleep(1)
    s.close()

tcpdump output:

sudo tcpdump tcp -i lo0 -vv -K

tcpdump: listening on lo0, link-type NULL (BSD loopback), capture size 262144 bytes
13:34:19.880009 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64)
    localhost.61201 > localhost.43542: Flags [S], seq 24512730, win 65535, options [mss 16344,nop,wscale 6,nop,nop,TS val 1467040500 ecr 0,sackOK,eol], length 0
13:34:19.880063 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64)
    localhost.43542 > localhost.61201: Flags [S.], seq 2392598781, ack 24512731, win 65535, options [mss 16344,nop,wscale 6,nop,nop,TS val 16287257 ecr 1467040500,sackOK,eol], length 0
13:34:19.880069 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    localhost.61201 > localhost.43542: Flags [.], seq 1, ack 1, win 6379, options [nop,nop,TS val 1467040500 ecr 16287257], length 0
13:34:19.880075 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    localhost.43542 > localhost.61201: Flags [.], seq 1, ack 1, win 6379, options [nop,nop,TS val 16287257 ecr 1467040500], length 0
13:34:19.880085 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 53)
    localhost.61201 > localhost.43542: Flags [P.], seq 1:2, ack 1, win 6379, options [nop,nop,TS val 1467040500 ecr 16287257], length 1
13:34:19.880093 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    localhost.43542 > localhost.61201: Flags [.], seq 1, ack 2, win 6379, options [nop,nop,TS val 16287257 ecr 1467040500], length 0
13:34:19.882033 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    localhost.43542 > localhost.61201: Flags [F.], seq 1, ack 2, win 6379, options [nop,nop,TS val 16287259 ecr 1467040500], length 0
13:34:19.882049 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    localhost.61201 > localhost.43542: Flags [.], seq 2, ack 2, win 6379, options [nop,nop,TS val 1467040502 ecr 16287259], length 0
13:34:20.885298 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 53)
    localhost.61201 > localhost.43542: Flags [P.], seq 2:3, ack 2, win 6379, options [nop,nop,TS val 1467041505 ecr 16287259], length 1
13:34:20.885451 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40)
    localhost.43542 > localhost.61201: Flags [R], seq 2392598783, win 0, length 0

I see that the client sends a TCP segment with the PUSH flag. I found an answer for that at https://superuser.com/questions/1455476/what-does-tcp-packet-p-flag-means-in-tcpdumps-output, which says the PUSH flag tells the receiver to hand that data to the application as quickly as possible.

So now I understand why the server hands the data to the application. But I still can't see why the server sends a FIN segment to the client. Why does it want to close the connection?

My expectations were:

  • The client sends data byte by byte, sleeping for 1 second per iteration
  • The server waits for the data
  • The server prints the data

UPD 2.

I'm dumb, shame on me, ha-ha.

  1. The server's code waits for data.
  2. The client sends data (1 byte), with the TCP PUSH flag set.
  3. The server receives the segment and sees the PUSH flag (which prompts the network stack to hand control to the application).
  4. The application sees that 1 byte.
  5. The connection is closed by the application.

So, I have to do my own buffering at the application layer.
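That application-layer buffering can be sketched as a small framing layer. The 4-byte length prefix and the function names below are my own choices for illustration, not anything from the original code, and socketpair stands in for a real TCP connection:

```python
import socket
import struct

def send_msg(sock, payload):
    # Prefix each message with its 4-byte big-endian length.
    sock.sendall(struct.pack('>I', len(payload)) + payload)

def recv_exact(sock, n):
    # Loop until exactly n bytes have arrived, no matter how the
    # stream was split into TCP segments on the wire.
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError('peer closed before the message completed')
        chunks.append(chunk)
        n -= len(chunk)
    return b''.join(chunks)

def recv_msg(sock):
    # Read the length header, then exactly that many payload bytes.
    (length,) = struct.unpack('>I', recv_exact(sock, 4))
    return recv_exact(sock, length)

if __name__ == '__main__':
    a, b = socket.socketpair()   # stands in for a real TCP connection
    send_msg(a, b'hello!')
    print(recv_msg(b))           # b'hello!'
    a.close()
    b.close()
```

With framing like this it no longer matters whether the kernel delivers the stream in 1-byte or 1024-byte chunks; recv_msg always returns whole messages.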

Thank you guys for your help.

Emilien Vidal
  • It knows because there is nothing left in the kernel's socket receive buffer. No mystery. There is no such thing as a 'message' in TCP. It is a byte stream protocol. – user207421 Jul 07 '22 at 00:38
  • "*I thought that .recv call blocks the application **until the buffer is filled** with the data.*" - wherever did you get that idea from? `recv()` returns *at least* 1 byte *up to* the maximum requested. Which means, it can return *any number of bytes* in between, whatever is currently available in the socket's receive buffer. If that buffer is empty, `recv()` waits for *at least* 1 byte to arrive, and then returns what it can. So, it is your responsibility to pay attention to the return value, and call `recv()` in a loop until you have received everything you are expecting. – Remy Lebeau Jul 07 '22 at 01:09
  • If you want `"hello!"` to be a single message why are you sending it a character at a time with a sleep in between? Sleeps in networking code accomplish exactly nothing except wasting time and masking other problems. – user207421 Jul 07 '22 at 10:47
  • @RemyLebeau thank you for the answer! I do understand that the buffer size is the amount of data I could get by the .recv call. But I can't get why the server closes a connection... I've just added a tcpdump output and client code I use to test behaviour – Emilien Vidal Jul 07 '22 at 10:48
  • @user207421 it's only for the test =) I'm not using such things in production code but it helps me to understand the processes more. For example, I expected that TCP connection will not be closed by the server and the data will be sent fine, without any problems. But I found that server closes a connection and I can't get why – Emilien Vidal Jul 07 '22 at 10:50
  • There is nothing about the server closing the connection unexpectedly in your question. NB Your citation about PUSH is incorrect. The PUSH flag in TCP does exactly nothing, and never has, and never can. It was included for an asynchronous-mode kernel API that has never surfaced. – user207421 Jul 07 '22 at 10:53
  • @user207421 I've just updated the question, everything is fine with the code. I found my mistake. Thank you for your help! – Emilien Vidal Jul 07 '22 at 10:56
  • @user207421 oh. That's sad, because now I want to try to forcefully send data from a client without the PUSH flag, to test whether the server will use the net-stack buffer to handle the data. – Emilien Vidal Jul 07 '22 at 10:59
  • @user207421 could you please answer one more question? Is there any way to wait for additional data on the server without giving control to the application? Like some instruction to the net-stack to keep data inside its internal buffers for some idle TTL (the duration after the last received byte, after which control is given to the application). – Emilien Vidal Jul 07 '22 at 11:01
  • The BSD Sockets API has a MSG_RECVALL flag (possibly misspelt here). I don't know whether it shows through into Python. – user207421 Jul 07 '22 at 12:01
  • @user207421 the flag is named MSG_WAITALL instead – Remy Lebeau Jul 07 '22 at 14:25
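As the last two comments point out, the BSD-sockets flag is MSG_WAITALL, and Python does expose it. A minimal sketch, using socketpair in place of a real TCP connection:

```python
import socket

if __name__ == '__main__':
    a, b = socket.socketpair()
    a.sendall(b'hel')
    a.sendall(b'lo!')
    # MSG_WAITALL asks the kernel to block until the full requested
    # length is available (or the peer closes / an error occurs).
    data = b.recv(6, socket.MSG_WAITALL)
    print(data)   # b'hello!'
```

Note that MSG_WAITALL can still return short on a signal, a timeout, or connection close, so a read loop remains the robust option.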

1 Answer


It depends not just on recv but also on the underlying protocol being used. Normally it's TCP, so let's go with that. TCP is a stream-oriented protocol: you receive chunks of data with no hint about whether more is to come. When the far side decides the conversation is over, it may send a FIN, a RESET, or just stop sending anything at all, leaving the receiver hanging.

Since the receiver has no idea whether more data is coming, it tends to pass received data to the program as quickly as possible. It has something now, it may never get anything again, so why wait? Commonly, though, received data will build up before the program gets a chance to ask for more.

You can't make recv wait until the buffer is full, but you can write your own middleware that does. It could call recv(2048) and then keep reducing the requested byte count until it reaches zero: basically a recvall function. It's not in the standard library because traditionally programmers didn't see the need, but it's also common to write your own. At the other extreme, some protocol implementations only do recv(1) because everything is being fed into a lexer.

Typically, the operating system holds a queue of received data in the kernel. The Python recv calls the C library's recv, which buffers data from the kernel. If there is buffered data, recv returns right away; otherwise the C library calls into the kernel to copy in more data (or to wait until data arrives).

You can set a timeout for the socket with s.settimeout(), or, to control both sender and receiver, go a bit lower level with s.setsockopt using SO_RCVTIMEO and SO_SNDTIMEO. But sockets have other modes too: you can use async sockets, select, and, if you don't mind a heavy lift, Windows and Linux have competing ways of doing "overlapped" or direct I/O operations.
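A minimal sketch of the settimeout approach (socketpair stands in for a real connection, and the 0.2-second value is arbitrary):

```python
import socket

if __name__ == '__main__':
    a, b = socket.socketpair()
    b.settimeout(0.2)    # recv raises socket.timeout after 0.2 s of silence
    try:
        b.recv(1024)     # nothing has been sent, so this times out
    except socket.timeout:
        print('recv timed out')
    a.close()
    b.close()
```

The setsockopt route with SO_RCVTIMEO expects a packed timeval structure, so settimeout is usually the more convenient spelling in Python.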

tdelaney
  • "Some protocols only do recv(1) because everything is being fed into a lexer": I'd be astonished. This would be so inefficient as to be highly unlikely, and it's not determined by 'some protocols' but by the *implementation* of a protocol. – user207421 Jul 07 '22 at 00:40
  • @user207421 - Then be astonished! `recv(1)` just goes to a local process buffer in the C library most of the time, so it's fast. You aren't going to do much better writing your own buffering than what the C lib folks did. It's no different than, say, `getc(FILE *stream)`. – tdelaney Jul 07 '22 at 00:54
  • @tdelaney isn't `recv()` more analogous to the POSIX `read()` call than to the C library's `fread()` call? i.e. isn't `recv()` a system call, and thus invokes a per-call overhead of communication with the kernel layer, even if it is only to retrieve a byte of data from an in-kernel socket-buffer? (that's how I understood it, anyway) – Jeremy Friesner Jul 07 '22 at 01:06
  • There is no such thing as `recv(1)` in the C library, with or without a 'local process buffer'. There is only `recv(fd, char*, int, int)`, which is a system call without any buffering at all. And that is why receiving one byte at a time is highly inefficient. – user207421 Jul 07 '22 at 05:02
  • @tdelaney thank you for the answer! I forgot about the network protocols and their implementation. I checked it out to be sure that it's all because of TCP rules, but found something strange. Could you please look into it? I've removed the timeout from the server's code (it was doing nothing as far as I can see), and I also added the client code and a tcpdump output to the question. I used tcpdump to monitor TCP segments over the lo0 (loopback) interface and found that the server is closing the connection, not the client. I just can't get why and how. – Emilien Vidal Jul 07 '22 at 10:38
  • @user207421 "*recv is a system call without any buffering at all*" - there is a buffer, actually. It is inside the socket. The kernel receives data from the network and puts it into the socket's buffer, which `recv` then copies from into the caller's buffer. – Remy Lebeau Jul 07 '22 at 14:28
  • @JeremyFriesner - What `recv` does depends on the operating system. In Windows it's an in-process call to winsock.dll or winsock2.dll. In Linux, I think it would trap to the kernel to get queued recv data, which isn't _that_ bad. But more typically you'd `fdopen` a FILE object to get the in-process cache. But it's hard to justify an extra caching layer in your own code for a small potential performance boost. – tdelaney Jul 07 '22 at 20:00
  • @RemyLebeau Agreed, there is a kernel socket buffer, but no 'local process buffer in the C library', which was the topic at hand. – user207421 Jul 08 '22 at 02:01
  • @tdelaney The 'performance boost' is *not* 'small' for receiving a buffer at a time rather than a byte at a time. You haven't tried this. You are just guessing, as you did about the non-existent `recv(1)` in the C library. Please stop it. – user207421 Jul 08 '22 at 02:02