
I started learning libev recently, and there's a readable/writable concept in an io watcher that I don't quite understand. As far as I know, there's a flag in Linux system programming:

O_ASYNC

A signal (SIGIO by default) will be generated when the specified file becomes readable or writable. This flag is available only for terminals and sockets, not for regular files.

So, since a regular file isn't concerned with being readable/writable, what do readable and writable really mean in socket programming? And what does the kernel do to find out whether a socket file descriptor is readable?

Considering the everything-is-a-file philosophy, does every socket descriptor with a different descriptor number actually point to the same file? If so, can I consider the readable/writable problem to be caused by synchronisation?

OK, it seems that I've asked a silly question. What I really mean is this: both sockets and regular files are read and written via file descriptors, so why do socket descriptors have a readable/writable concept while regular files don't? EJP told me that this is because of the buffers, and that each descriptor has its own pair of buffers, so here's my conclusion: the readable/writable concept applies to buffers. If a buffer is empty, it's unreadable; if it is full, it's unwritable. Readable and writable have nothing to do with synchronisation, and since a regular file doesn't have such a buffer, it is always readable and writable.

And one more question: when we say "receive buffer", this buffer is not the same thing as the `buf` argument in `int recv(SOCKET socket, char FAR* buf, int len, int flags);`, right?

  • Each connection corresponds to a file descriptor. If 10 clients connect to the socket of a server, the server sees 10 file descriptors. The `int recv()` reads the system buffer and copies the content to the user buffer passed by the application. – alvits Jul 10 '15 at 02:48
  • @alvits so even if there are 10 sockets, there's only one system buffer, right? And if a socket is readable, is it readable on the system buffer or on the user buffer? –  Jul 10 '15 at 03:00
  • It's readable on the system buffer, hence the system call `recv()` can be called to transfer the content to user buffer. It seems you are interested to learn the system side. I'd suggest reading up on kernel modules and drivers. – alvits Jul 10 '15 at 18:36
  • You can't compare the read and write behaviors of different devices. Take a keyboard, for example. When no one is typing on the keyboard, there is nothing to read, but that doesn't mean it is the end of file. Regular files, on the other hand, always have data ready for reading. You always get to the end of file, even if the file is empty. A socket behaves like a keyboard: if data hasn't arrived, there is nothing to read. – alvits Jul 10 '15 at 18:49
  • Thanks, clear answer. I'm checking ldd for the detail. –  Jul 13 '15 at 14:28

2 Answers


This question is specifically addressed in Unix Network Programming, Volume 1: The Sockets Networking API (3rd Edition) by W. Richard Stevens, Bill Fenner, and Andrew M. Rudoff. I've added some minor edits for readability:

Under What Conditions Is a Descriptor Ready?

[...] The conditions that cause select to return "ready" for sockets [are]:

1. A socket is ready for reading if any of the following four conditions is true:

  • The number of bytes of data in the socket receive buffer is greater than or equal to the current size of the low-water mark for the socket receive buffer. A read operation on the socket will not block and will return a value greater than 0 (i.e., the data that is ready to be read). [...]
  • The read half of the connection is closed (i.e., a TCP connection that has received a FIN). A read operation on the socket will not block and will return 0 (i.e., EOF).
  • The socket is a listening socket and the number of completed connections is nonzero. [...]
  • A socket error is pending. A read operation on the socket will not block and will return an error (–1) with errno set to the specific error condition. [...]
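A few of the read-side conditions above can be observed with a small experiment. This is only a sketch under stated assumptions: it uses Python's `select` and `socket` modules (thin wrappers over the same system calls), a connected AF_UNIX pair from `socket.socketpair()` instead of a TCP connection, and a hypothetical helper named `is_readable`. The listening-socket and pending-error cases are not covered.

```python
import select
import socket

def is_readable(sock):
    """Hypothetical helper: ask select() whether sock is ready for reading."""
    # A timeout of 0 makes select() poll and return immediately.
    readable, _, _ = select.select([sock], [], [], 0)
    return sock in readable

# Assumption: a connected pair of stream sockets stands in for a TCP connection.
a, b = socket.socketpair()

empty_state = is_readable(a)     # nothing in a's receive buffer yet

b.sendall(b"ping")
data_state = is_readable(a)      # the receive buffer now holds 4 bytes
payload = a.recv(16)             # does not block, returns the buffered data

drained_state = is_readable(a)   # the buffer is empty again

b.close()                        # the peer closes its end
eof_state = is_readable(a)       # readable, although no data is pending
eof_read = a.recv(16)            # returns b"" (end of file) without blocking

a.close()
```

Note that the last case is "readable" with nothing to read: readiness only means that a read will not block.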

2. A socket is ready for writing if any of the following four conditions is true:

  • The number of bytes of available space in the socket send buffer is greater than or equal to the current size of the low-water mark for the socket send buffer and either: (i) the socket is connected, or (ii) the socket does not require a connection (e.g., UDP). This means that if we set the socket to nonblocking, a write operation will not block and will return a positive value (e.g., the number of bytes accepted by the transport layer). [...]
  • The write half of the connection is closed. A write operation on the socket will generate SIGPIPE.
  • A socket using a non-blocking connect has completed the connection, or the connect has failed.
  • A socket error is pending. A write operation on the socket will not block and will return an error (–1) with errno set to the specific error condition. [...]
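The first write-side condition can be sketched the same way: fill the send buffer of a non-blocking socket until the kernel refuses more data, then let the peer drain it. The assumptions here are a local `socket.socketpair()` rather than TCP, an arbitrary 4096-byte chunk size, and a hypothetical helper named `is_writable`; actual buffer sizes vary by system, so the sketch does not print or assume any particular byte count.

```python
import select
import socket

def is_writable(sock):
    """Hypothetical helper: ask select() whether sock is ready for writing."""
    _, writable, _ = select.select([], [sock], [], 0)
    return sock in writable

a, b = socket.socketpair()
a.setblocking(False)
b.setblocking(False)

fresh_state = is_writable(a)     # empty send buffer: plenty of space

# Keep writing until the kernel has no buffer space left for this socket.
sent = 0
chunk = b"x" * 4096
try:
    while True:
        sent += a.send(chunk)
except BlockingIOError:
    pass                         # a blocking socket would have blocked here

full_state = is_writable(a)      # no space left: not writable

# Let the peer read everything that was queued.
drained = 0
try:
    while True:
        drained += len(b.recv(65536))
except BlockingIOError:
    pass                         # nothing more to read for now

recovered_state = is_writable(a) # space is available again

a.close()
b.close()
```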

3. A socket has an exception condition pending if there is out-of-band data for the socket or the socket is still at the out-of-band mark.

[Notes:]

  • Our definitions of "readable" and "writable" are taken directly from the kernel's soreadable and sowriteable macros on pp. 530–531 of TCPv2. Similarly, our definition of the "exception condition" for a socket is from the soo_select function on these same pages.

  • Notice that when an error occurs on a socket, it is marked as both readable and writable by select.

  • The purpose of the receive and send low-water marks is to give the application control over how much data must be available for reading or how much space must be available for writing before select returns a readable or writable status. For example, if we know that our application has nothing productive to do unless at least 64 bytes of data are present, we can set the receive low-water mark to 64 to prevent select from waking us up if less than 64 bytes are ready for reading.

  • As long as the send low-water mark for a UDP socket is less than the send buffer size (which should always be the default relationship), the UDP socket is always writable, since a connection is not required.
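The 64-byte example from the notes can be tried out roughly as follows. This is a sketch that assumes a system where `select` honors `SO_RCVLOWAT` on TCP sockets (Linux documents this since 2.6.28) and where a loopback interface is available. The helper name `is_readable` and the timeouts are illustrative choices, not part of the quoted text.

```python
import select
import socket

def is_readable(sock, timeout):
    """Hypothetical helper: wait up to `timeout` seconds for read readiness."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return sock in readable

# Assumption: a TCP connection over loopback on an OS-assigned port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
client = socket.create_connection(listener.getsockname())
server, _ = listener.accept()

# Raise the receive low-water mark from its default to 64 bytes.
server.setsockopt(socket.SOL_SOCKET, socket.SO_RCVLOWAT, 64)

client.sendall(b"x" * 10)
below_mark = is_readable(server, 0.2)   # only 10 bytes buffered

client.sendall(b"x" * 54)
at_mark = is_readable(server, 2.0)      # 64 bytes buffered

received = server.recv(1024)

for s in (client, server, listener):
    s.close()
```

With the default low-water mark, the first 10 bytes alone would already make the socket readable.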


A related read, from the same book: TCP socket send buffer and UDP socket (pseudo) send buffer

OfirD

Readable means there is data or a FIN present in the socket receive buffer.

Writable means there is space available in the socket send buffer.

Files don't have socket send or receive buffers.
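For contrast, here is a minimal sketch of how `select` treats a regular file, assuming a POSIX system (on Windows, `select` accepts only sockets). The temporary file is just an illustrative stand-in for any regular file.

```python
import select
import tempfile

# Assumption: an empty temporary file stands in for any regular file.
with tempfile.TemporaryFile() as f:
    readable, writable, _ = select.select([f], [f], [], 0)
    file_readable = f in readable   # reported ready even though it is empty
    file_writable = f in writable
    first_read = f.read(16)         # returns b"" (end of file) immediately
```

The file is reported as ready in both directions even though there is nothing in it: a read simply returns end of file instead of waiting for data to arrive.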

Considering the everything-is-a-file philosophy

What philosophy is that?

does every socket descriptor with different descriptor number actually point to the same file?

What file? Why would they point to the same anything? Question doesn't make sense.

I'm confused with one thing: when a socket is created, the descriptor is actually point to the receive and send buffers of the socket

It 'points to' a lot of things: a source address, a target address, a source port, a target port, a pair of buffers, a set of counters and timers, ...

not the file represent the net hardware.

There is no such thing as 'the file represent[ing] the net hardware', unless you're talking about the device driver entry in /dev/..., which is barely relevant. A TCP socket is an endpoint of a connection. It is specific to that connection, to TCP, to the source and target addresses and ports, ...

user207421
  • Does this mean all socket file descriptors with different descriptor numbers actually point to the same buffer? –  Jul 09 '15 at 11:35
  • No, each socket has its own pair. – user207421 Jul 09 '15 at 12:22
  • `What philosophy is that?` Is this a serious question? https://en.wikipedia.org/wiki/Everything_is_a_file - I guess you must be a Windows guy, but if you're answering questions about sockets you should be aware that the socket interface originated in BSD and unix philosophy and terminology is going to be relevant sometimes. A socket *is* a file; what you're calling a "file" is properly termed a *regular file*, and both are accessed through *file descriptors*. –  Jul 09 '15 at 13:22
  • As for what the asker is groping for, I'm not sure but it might be something about multiple file descriptors to the same open file description - e.g. `dup`, `dup2`, etc. If you make a copy of a file descriptor referring to a socket, you will actually end up with 2 file descriptors pointing to the same socket (which means they share the buffers, the connection status, and everything else that is an attribute of the socket itself). –  Jul 09 '15 at 13:27
  • @WumpusQ.Wumbley It's just a sound bite that media loves and nobody gives a damn about when developing the system. Files are files. Non-files are not files. Pipes and sockets have file descriptors but definitely don't behave like files. Character devices are visible in the filesystem namespace, but most of them are not even readable. "Everything is a file" except memory, system calls, network devices and pretty much everything except files. – Art Jul 09 '15 at 13:39
  • it's very useful to have a general class of "things that can consume and/or generate a sequence of bytes", create a common interface (`read`, `write`) for that general category and then specialize from there into the various subclasses that offer additional operations. That design is one of the best things about unix. And the top level of that class hierarchy happens to be called "file". –  Jul 09 '15 at 13:53
  • "class hierarchy"? That doesn't sound very unix to me. There are entries in the file system that can't be read or written, but you can open them. There are file descriptors that you can get without calling `open`. There are file descriptors that can't be obtained with `open`, you can't call `read` or `write` on them and they aren't inherited over `fork`. You can modify files without ever opening a file descriptor to them. The superclass of "file like objects" in Unix that people want to shove into "everything is a file" is an empty class. It's a marketing sound bite. – Art Jul 09 '15 at 14:33
  • @WumpusQ.Wumbley A socket *descriptor* is a file *descriptor*. In Unix and derivatives. There is no reason to conclude from that that a socket is a file, and concluding that all sockets refer to the same file is merely irrational. OP's statement and question remain unclear. NB [Wikipedia is not a reliable resource](https://en.m.wikipedia.org/wiki/Wikipedia:Wikipedia_is_not_a_reliable_source). NB 2 I was a Unix guy long before I was an OS/2 guy before I was a Windows guy, and a RSTS/E guy before that. Not that it's relevant. – user207421 Jul 09 '15 at 22:59
  • Hey guys, there's no need to argue over my silly question. I've updated my question with more details. –  Jul 10 '15 at 02:12
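Two points from this comment thread, that each socket has its own pair of buffers and that a `dup`-style copy of a descriptor refers to the same socket, can be checked with a short sketch. It assumes Python's `socket.dup()` method (which duplicates the underlying descriptor) and local socket pairs; the helper `is_readable` is hypothetical.

```python
import select
import socket

def is_readable(sock):
    """Hypothetical helper: poll select() for read readiness."""
    readable, _, _ = select.select([sock], [], [], 0)
    return sock in readable

# Two unrelated connections: a <-> b and c <-> d.
a, b = socket.socketpair()
c, d = socket.socketpair()

# A second descriptor number that refers to the same socket as `a`.
a_copy = a.dup()
different_numbers = a.fileno() != a_copy.fileno()

b.sendall(b"hello")              # lands in the receive buffer of a's socket only

a_state = is_readable(a)
copy_state = is_readable(a_copy) # same socket, same buffer
other_state = is_readable(c)     # separate socket, separate buffer

first = a_copy.recv(5)           # reading through the copy drains the shared buffer
after_state = is_readable(a)

for s in (a, a_copy, b, c, d):
    s.close()
```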