27

I've seen a number of questions regarding send() that discuss the underlying protocol. I'm fully aware that for TCP any message may be broken up into parts as it's sent and there's no guarantee that the receiver will get the message in one atomic operation. In this question I'm talking solely about the behavior of the send() system call as it interacts with the networking layer of the local system.

According to the POSIX standard, and the send() documentation I've read, the length of the message to be sent is specified by the length argument. Note that: send() sends one message, of length length. Further:

If space is not available at the sending socket to hold the message to be transmitted, and the socket file descriptor does not have O_NONBLOCK set, send() shall block until space is available. If space is not available at the sending socket to hold the message to be transmitted, and the socket file descriptor does have O_NONBLOCK set, send() shall fail.

I don't see any possibility in this definition for send() to ever return any value other than -1 (which means no data is queued in the kernel to be transmitted) or length, which means the entire message is queued in the kernel to be transmitted. I.e., it seems to me that send() must be atomic with respect to locally queuing the message for delivery in the kernel.

  1. If there is enough room in the socket queue in the kernel for the entire message and no signal occurs (normal case), it's copied and returns length.
  2. If a signal occurs during send(), then it must return -1. Obviously we cannot have queued part of the message in this case, since we don't know how much was sent. So nothing can be sent in this situation.
  3. If there is not enough room in the socket queue in the kernel for the entire message and the socket is blocking, then according to the above statement send() must block until space becomes available. Then the message will be queued and send() returns length.
  4. If there is not enough room in the socket queue in the kernel for the entire message and the socket is non-blocking, then send() must fail (return -1) and errno will be set to EAGAIN or EWOULDBLOCK. Again, since we return -1 it's clear that in this situation no part of the message can be queued.

Am I missing something? Is it possible for send() to return a value which is >=0 && <length? In what situation? What about non-POSIX/UNIX systems? Is the Windows send() implementation conforming with this?

MadScientist
  • 1
    There appears to be some ambiguity. Although `send()` is supposed to be equivalent to `sendto()` if the socket refers to a connection-mode socket, POSIX also says `send()` is equivalent to `write()` if `flags` is 0. A non-blocking or interrupted blocking `write()` is allowed to return a short value. – jxh Oct 31 '13 at 02:29

6 Answers

14

Your point 2 is over-simplified. The normal condition under which send returns a value greater than zero but less than length (note that, as others have said, it can never return zero except possibly when the length argument is zero) is when the message is sufficiently long to cause blocking, and an interrupting signal arrives after some content has already been sent. In this case, send cannot fail with EINTR (because this would prevent the application from knowing it had already successfully sent some data) and it cannot re-block (since the signal is interrupting, and the whole point of that is to get out of blocking), so it has to return the number of bytes already sent, which is less than the total length requested.
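
A minimal sketch of how a caller on a blocking socket might cope with this, assuming a hypothetical helper named `send_all` (it is not part of POSIX or any library). It treats `EINTR` as "nothing was queued, try again" and treats any positive return smaller than the requested length as a short send to be continued:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (illustration only): keep calling send() until all
 * `len` bytes have been queued or a real error occurs.
 * Returns 0 on success, -1 on error with errno left as set by send(). */
int send_all(int fd, const void *buf, size_t len, int flags)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t n = send(fd, p, len, flags);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any byte was queued */
            return -1;          /* genuine failure */
        }
        /* n may be smaller than len (a short send): skip what was queued */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

With a loop like this the caller does not need to know whether a given implementation ever produces short sends on blocking sockets; it behaves the same either way.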

R.. GitHub STOP HELPING ICE
  • I really appreciated EJP's knowledgeable comments about implementations and how they behave but I think this answer most directly and specifically addresses my question. – MadScientist Nov 06 '13 at 17:12
6
  1. According to the Posix specification and all the man 2 send pages I have ever seen in 30 years, yes, send() can return any value > 0 and <= length. Note that it cannot return zero.

  2. According to a discussion a few years ago on news:comp.protocols.tcp-ip where all the TCP implementors are, a blocking send() won't actually return until it has transferred all the data to the socket send buffer: in other words, the return value is either -1 or length. It was agreed that this was true of all known implementations, and also true of write(), writev(), and sendmsg().

user207421
  • 2
    For #2, signal interrupt may trigger short write return value. – jxh Oct 31 '13 at 02:23
  • Thanks EJP, but your comment #1 is not quite answering my question. Certainly the documentation says it _can_ return these values, there's no debate about that. But I don't see any situation in which it's _possible_ to return any value other than `-1` or _length_. When does it happen? – MadScientist Oct 31 '13 at 02:41
  • Ah. I think jxh has it with the first comment. If you make this an answer I'll accept it. In my #2 I was making an apparently bad assumption that on a signal it would return `-1`. I'll bet that's not true: I'll bet (although the POSIX spec doesn't say this explicitly) that if the return value is – MadScientist Oct 31 '13 at 02:42
  • jxh: I saw that SO answer. I just don't agree with it, at least in any remotely conforming implementation. IMO the standard is very clear on this: a send() on a blocking socket will _block_ if there's not enough space to queue the message. It will _NOT_ return 0. And it will fail (return `-1`) on non-blocking sockets without enough space. – MadScientist Oct 31 '13 at 02:50
  • You can get -1 with `errno=EINTR`, in which case the usual thing is to keep looping. Java for example does this, as did my Cobol implementations decades ago. – user207421 Oct 31 '13 at 04:09
  • 1
    @jxh I'm sorry but I just don't believe anyone who says they have seen `send()` return zero. I've never seen it, or coded against it, in 30 years, including some very widely used Cobol runtime systems, except in the case where the length parameter was zero. – user207421 Oct 31 '13 at 04:10
  • 4
    @MadScientist: `errno` is **not** set (to `EINTR` or anything else meaningful) if a signal interrupts the transfer after some data is sent. The `EINTR` error condition only applies if a signal interrupts the operation *before any data is transferred*. In all other cases it is not an "error" but simply a "short send" (shorter than the full length). – R.. GitHub STOP HELPING ICE Oct 31 '13 at 05:22
  • 1
    @EJP Are you saying that the [section on `sendall`](http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#sendall) in Beej's famous "Guide to Network Programming" is wrong, or at least slightly misinformed? It describes the need to make sure all data is sent, possibly by repeating calls to `send` until everything has gone out, which seems to contradict what you outline in your point 2. Or have I missed something? – Armen Michaeli Nov 16 '16 at 13:20
  • I've got a network `strace` on a rhel 6 host that shows a `sendmsg` which sent partial data. Of 8 `iovec` "buffers" in the call, it sent only 2 and returned that amount in bytes. Has the logic from your second point changed? – Sotirios Delimanolis Jan 04 '17 at 05:49
  • "According to a discussion a few years ago on news:comp.protocols.tcp-ip where all the TCP implementors are" - comments like that are funny. Like, all TCP implementors who ever live? I thought TCP implementors read RFCs and POSIX? – pfalcon Feb 06 '19 at 11:43
  • @pfalcon Obviously the comment was retrospective. Posix mandates the behaviour I described. The RFCs don't describe system calls at all. – user207421 May 10 '21 at 12:33
  • 1
    @amn I am reporting the consensus of TCP/IP implementors in the newsgroups I cited. I think that carries some weight. NB your link is broken: currently [here](https://beej.us/guide/bgnet/html/). – user207421 Aug 27 '21 at 02:42
3

I know how the thing works on Linux, with the GNU C Library. Point 4 of your question reads differently in this case. If you set the flag O_NONBLOCK for the file descriptor, and if it is not possible to queue the entire message in the kernel atomically, send() returns the number of bytes actually sent (it can be between 1 and length). Only if nothing at all can be queued does it return -1 with errno set to EWOULDBLOCK.

(With a file descriptor working in the blocking mode, send() would block.)

Pranav Singh
2

It is possible for send() to return a value >= 0 && < length. This could happen if the send buffer has less room than the length of the message upon a call to send(). Similarly, if the current receiver window size known to the sender is smaller than the length of the message, only part of the message may be sent. Anecdotally, I've seen this happen on Linux through a localhost connection when the receiving process was slow to unload the data it was receiving from its receive buffer.

My sense is that one's actual experience will vary a good bit by implementation. From this Microsoft link, it's clear that a non-error return value less than the length can occur.

It is also possible to get a return value of zero (again, at least with some implementations) if a zero-length message is sent.

This answer is based on my experience, as well as drawing upon this SO answer particularly.

Edit: From this answer and its comments, evidently an EINTR failure may only result if the interruption comes before any data is sent. An interruption that arrives after some data has been sent would therefore be another possible way to get such a return value.

David Duncan
  • Oops, I'm a bit late with my SO link, which I see jxh noted in a comment. But that's the best link I found as well. – David Duncan Oct 31 '13 at 02:50
  • I'm talking about a TCP socket (AF_INET/SOCK_STREAM): how can the receiver's window size impact the local send()? Maybe if you're using AF_UNIX or something it's different. If the send buffer does not have enough room then the send() must block (for blocking sockets). I agree with EJP's answer here: I'd be pretty surprised if implementors got this wrong. – MadScientist Oct 31 '13 at 03:07
  • The receiver's window size is part of the TCP header, so this information goes back to the sender. I was assuming that this is relevant to send(), but...I actually don't know exactly how that info is used. I see your issue with how the behavior seems defined for both non-blocking and blocking sockets, but I think there are things that can happen while in the process of sending, such as interruptions. Sorry and good question; perhaps someone will come along with a better answer. – David Duncan Oct 31 '13 at 03:34
  • I understand about the TCP header but that is 'way down in the protocol layer. I'm talking about much farther up the stack, at the sender's system call interface between userspace and the kernel. As far as I'm aware it's not the case that the receiver's TCP receive window directly impacts that interface. – MadScientist Nov 01 '13 at 15:09
2

On a 64-bit Linux system:

sendto(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4294967296, 0, NULL, 0) = 2147479552

So, even when asked to send a lowly 4GB, Linux chickens out and sends less than 2GB. So, if you think that you can ask it to send 1TB and it will patiently sit there, keep wishing.

Similarly, on an embedded system with just a few KBs free, don't think that it'll fail or will wait for something - it'll send as much as it can, and tell you how much that was, letting you retry with the rest (or do something else in the meantime).

Everyone agrees that a signal interrupting send() can cause a short send. But a signal can arrive at any time, so there can always be a short send.

And finally, POSIX says that the number of bytes sent is returned, period. The whole of Unix, and POSIX which formalizes it, is built on the concept of short reads/writes, which allows implementations of POSIX systems to scale from the tiniest embedded devices to supercomputers with proverbial "bigdata". So, there is no need to try to read between the lines and find indulgences for the particular ad hoc implementation you have on your hands. There are many more implementations out there, and as long as you follow the word of the standard, your app will be portable among them.

pfalcon
  • "And finally, POSIX says that the number of bytes sent is returned, period." From man 2 write: "According to POSIX.1, if count is greater than SSIZE_MAX, the result is implementation-defined;" ... "On Linux, write() (and similar system calls) will transfer at most 0x7ffff000 (2,147,479,552) bytes, returning the number of bytes actually transferred. (This is true on both 32-bit and 64-bit systems.)" – user2973 Sep 11 '21 at 20:19
0

To clarify a little, where it says:

shall block until space is available.

there are several ways to wake up from that block/sleep:

  • Enough space becomes available.
  • A signal interrupts the current blocking operation.
  • SO_SNDTIMEO is set for the socket and the timeout expires.
  • Other, e.g. the socket is closed in another thread.
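
As an illustration of the SO_SNDTIMEO case, here is a minimal sketch assuming a hypothetical helper named `set_send_timeout`; only `setsockopt()`, `SOL_SOCKET`, `SO_SNDTIMEO` and `struct timeval` are real API, the rest is made up for the example:

```c
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical helper (illustration only): bound how long a blocking send()
 * may wait for space in the socket send buffer.
 * Returns 0 on success, -1 on error with errno set by setsockopt(). */
int set_send_timeout(int fd, long sec, long usec)
{
    struct timeval tv;

    tv.tv_sec = sec;
    tv.tv_usec = usec;
    return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv);
}
```

According to the Linux socket(7) man page, once the timeout expires a blocked send() returns the amount of data transferred so far, or -1 with errno set to EAGAIN or EWOULDBLOCK if nothing was transferred. Other systems may differ in detail.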

So things end up thus:

  1. If there is enough room in the socket queue in the kernel for the entire message and no signal occurs (normal case), it's copied and returns length.
  2. If a signal occurs during send(), then it must return -1. Obviously we cannot have queued part of the message in this case, since we don't know how much was sent. So nothing can be sent in this situation.
  3. If there is not enough room in the socket queue in the kernel for the entire message and the socket is blocking, then according to the above statement send() must block until space becomes available. Then the message will be queued and send() returns length, unless the block ends early: send() can be interrupted by a signal, the send timeout can elapse, and so on, causing a short send/partial write. Reasonable implementations will return -1 and set errno to an adequate value if nothing was copied to the send buffer.
  4. If there is not enough room in the socket queue in the kernel for the entire message and the socket is non-blocking, then send() must fail (return -1) and errno will be set to EAGAIN or EWOULDBLOCK. Again, since we return -1 it's clear that in this situation no part of the message can be queued.
ninjalj