18

I've seen a few write-ups comparing select() with poll() or epoll(), and I've seen many guides discussing the actual usage of select() with multiple sockets.

However, what I can't seem to find is a comparison to a non-blocking recv() call without select(). If I only have one socket to read from and one socket to write to, is there any justification for using the select() call? The recv() call can be set up to not block and to return an error (WSAEWOULDBLOCK) when there is no data available, so why bother calling select() when you have no other sockets to examine? Is the non-blocking recv() call much slower?

tshepang
Gren Meera
  • That's probably all details of your implementation. Did you try it? – Carl Norum Oct 03 '13 at 21:17
  • What are you going to do when there's no data to read continually? Loop forever? Or use select or poll or epoll?? Regardless of blocking or nonblocking, you need to wait for the data somehow. – goji Oct 03 '13 at 21:18
  • Yep, as @Troy said, this way you will implement active wait - meaning no mercy for processor, when there is nothing to read. – zoska Oct 30 '13 at 16:40

3 Answers

10

You wouldn't want a non-blocking call to recv() without some other means of waiting for data on the socket, because you would end up polling in an endless loop and eating up CPU time.

If you have no other sockets to examine and nothing else to do in the same thread, a blocking call to read is likely to be the most efficient solution. In such a situation, though, worrying about its efficiency is likely to be premature optimisation.

These kinds of considerations only tend to come into play as the socket count increases.

Nonblocking calls are only faster in the context of handling multiple sockets on a single thread.

goji
  • 1
    I take it from your response and the responses below that a looping recv() is more CPU intensive, which leads me to suspect that select() in a loop actually suspends the thread then? This might be precisely the difference I was wondering about. Also, if this is true, I would suspect it only suspends for the time duration, which turns select into a blocking call? Learning this, how would you suggest a 2 socket thread (1 for input, 1 for output) in which you need them both to process at maximum speeds, but the output is only going to matter when new data is available from a queue? – Gren Meera Oct 04 '13 at 14:40
  • Prior to calling `select()` you populate its structures (bitsets). If you have data buffered and ready to be written, you set the output socket in the write bitset; the input socket would always be set in the read bitset. This is very typical select usage and there are many examples on Google. – goji Oct 04 '13 at 18:55
  • 1
    This book is the sockets bible imo: http://www.amazon.com/UNIX-Network-Programming-Networking-Sockets/dp/013490012X – goji Oct 04 '13 at 18:57
  • 1
    Unfortunately all examples I've found for read/write select() threads use the timeval to block select(), which is NOT reading and writing as fast as possible. You would need to wait for select() to return before you can call it again. If you give select() a timeval of zero, I take it this is what you meant? Is giving a timeval of zero in your loop still considered kinder to the CPU than simply using the non-blocking calls since you will be calling it at the same rate? – Gren Meera Oct 04 '13 at 19:20
  • I think you misunderstand select. If select is blocking, there is no I/O to be done, so blocking is exactly what you want unless you have other calculations that need doing. select will return as soon as there is something in a socket to read, or a socket is ready to be written to. – goji Oct 04 '13 at 19:24
  • 1
    That was all within my understanding, I believe I'm stuck on something different. For example, I have data in my queue that I need to write. I call select(). It will return most likely with a write flag set, I call send(), everything is merry. Now I have nothing in my write queue, but I still wish to read. I call select in my loop again. It will block now until data comes in for reading, however I may end up with more data that I wish to send in my send queue. Unfortunately, I am blocked in select and cannot send data until either select() times out or I receive data. – Gren Meera Oct 04 '13 at 20:17
  • 2
    For more background information: My problem is that I must read as fast as possible data coming in on a socket, and I need to write as fast as I can when data is available in my process to write. This is on one thread, and it is using slightly older libraries on a machine that does not and will not support signals. The best I seem to have is select and non-blocking calls. This is where I was asking advice for. – Gren Meera Oct 04 '13 at 20:23
  • I have a somewhat similar problem. I have an application which can receive data over a single socket or via a non-socket interface. I wish to process all data with minimal latency. So creating, zero-ing, and setting an fd_set for select is costly. Before people start misquoting Knuth, I'm squeezing nanoseconds so this is not a premature optimization. So for the case of polling reads on a single socket, is a non-blocking select call equivalent to a non-blocking recv call? Are there any gotchas in the latter case? – fredbaba Dec 18 '13 at 20:50
6

If there is no data available and you use non-blocking I/O, recv() will return immediately. Then what should the program do? You would need to call recv() in a loop until data becomes available, which just uses CPU for pretty much no reason.

Spinning on recv() and burning CPU in that manner is very undesirable; you'd rather have the process wait until data becomes available and then get woken up. That's what select()/poll() and similar calls do.

Calling sleep() in the loop to avoid burning CPU is not a good solution either. You'd introduce high latency, as the program would not be able to process data as soon as it becomes available.

tshepang
nos
5

select() and friends let you design the workflow in such a way that the slowness of one socket does not impede the speed at which you can serve another. Imagine that data arrives fast on the receiving socket and you want to accept it as fast as possible and store it in memory buffers, but the sending socket is slow. Once you've filled up the OS send buffers and send() has given you EWOULDBLOCK, you can call select() to wait on both the receiving and the sending socket. select() will return as soon as either new data arrives on the receiving socket or some buffer space is freed and you can write more data to the sending socket, whichever happens first.
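The waiting step described above might be sketched as follows, assuming POSIX select(). The function name `wait_for_io` and its parameters are hypothetical. The input socket is always placed in the read set, while the output socket is placed in the write set only when there is buffered data waiting to go out (otherwise select() would return immediately, since an idle socket is nearly always writable).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>

/* Hypothetical helper: block until in_fd is readable or, when have_output
 * is nonzero, out_fd is writable. Sets *can_read and *can_write to 0 or 1.
 * Returns the number of ready descriptors, or -1 on error. */
int wait_for_io(int in_fd, int out_fd, int have_output,
                int *can_read, int *can_write)
{
    /* select() can only track descriptors below FD_SETSIZE. */
    assert(in_fd >= 0 && in_fd < FD_SETSIZE);
    assert(out_fd >= 0 && out_fd < FD_SETSIZE);

    for (;;) {
        /* The sets are overwritten by select(), so rebuild them each time. */
        fd_set readfds, writefds;
        FD_ZERO(&readfds);
        FD_ZERO(&writefds);
        FD_SET(in_fd, &readfds);
        if (have_output)
            FD_SET(out_fd, &writefds);

        int maxfd = in_fd > out_fd ? in_fd : out_fd;
        int ready = select(maxfd + 1, &readfds, &writefds, NULL, NULL);
        if (ready < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, wait again */
            return -1;
        }

        *can_read = FD_ISSET(in_fd, &readfds) != 0;
        *can_write = FD_ISSET(out_fd, &writefds) != 0;
        return ready;
    }
}
```

A caller would loop on this, calling recv() when `*can_read` is set and send() when `*can_write` is set, and passing `have_output` according to whether its own buffer is non-empty.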

Of course a more realistic use case for select() is when you have multiple sockets to read from and/or to write to, or when you must pass the data between your two sockets in both directions.

In fact, select() tells you when the next read or write operation on a socket is expected to succeed without blocking, so if you only read and write when select() says you can, your program will almost always work even if you didn't make the sockets non-blocking! It is still unwise, because there are edge cases in which the next operation may block even though select() reported the socket as "ready".

On the other hand, making the sockets non-blocking and not using select() is almost never advisable because of the reason explained by @Troy.

Community
crosser