Why is select used in Linux

Question

I was going through a serial-port program and noticed that it calls select() before read(). Why exactly is this required? Why can't we just call read() directly and check whether it fails? Also, why do we have to pass the file descriptor incremented by 1 when I am already passing the file descriptor set to select()?

Example:

r = select(fd+1, &fds, NULL, NULL, &timeout); where fds already contains fd

A call to `read(2)` may block. Also, read about `poll(2)` syscall (which is better than `select(2)` for multiplexing purposes; read more about the *C10K problem*). — Basile Starynkevitch, Jan 27 '13 at 07:52
Both [poll(2)](http://man7.org/linux/man-pages/man2/poll.2.html) and the old [select(2)](http://man7.org/linux/man-pages/man2/select.2.html) wait and *multiplex* on *several* file descriptors. Their roles are conceptually similar (but `poll` is more C10K friendly, since it can multiplex on more than 1024 file descriptors). — Basile Starynkevitch, Jul 27 '18 at 19:42
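
For comparison with the `select()` call in the question, here is a minimal sketch of the same readiness check written with `poll(2)`. The helper name `wait_readable_poll` is hypothetical and not from the original program, and treating every event other than `POLLIN` as an error is a simplification made for this sketch.

```c
#include <poll.h>

/* Hypothetical helper: wait up to timeout_ms milliseconds for fd to
 * become readable.
 * Returns 1 if data can be read, 0 on timeout, -1 on error.
 * Simplification: any event other than POLLIN is reported as an error. */
int wait_readable_poll(int fd, int timeout_ms)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN;   /* we only care about readability */
    pfd.revents = 0;

    /* The second argument is the number of entries in the array,
     * not a descriptor value, so there is no "fd + 1" here. */
    int r = poll(&pfd, 1, timeout_ms);
    if (r < 0)
        return -1;
    if (r == 0)
        return 0;
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

Note that `poll()` takes an array of descriptors plus its length, so the descriptor numbers themselves are not limited by the size of an `fd_set`.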

Jonathan Leffler · Accepted Answer · 2019-09-30T17:31:18.430

47

The select() system call tells you whether there is any data to read on the file descriptors that you're interested in. Strictly, it is a question of whether a read operation on the file descriptor will block or not.

If you execute read() on a file descriptor — such as that connected to a serial port — and there is no data to read, then the call will hang until there is some data to read. Programs using select() do not wish to be blocked like that.
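
As a rough illustration of the pattern described above, the sketch below waits for data with a timeout and only then calls read(). The function name read_with_timeout and its return codes are assumptions made for this example; they are not taken from the program in the question.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms milliseconds for fd to
 * become readable, then read from it.
 * Returns the number of bytes read (> 0), 0 on end-of-file,
 * -1 on error, or -2 if the timeout expired with nothing to read. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    fd_set fds;
    struct timeval timeout;

    /* select() modifies both the set and (on Linux) the timeout,
     * so they are rebuilt on every call. */
    FD_ZERO(&fds);
    FD_SET(fd, &fds);
    timeout.tv_sec = timeout_ms / 1000;
    timeout.tv_usec = (timeout_ms % 1000) * 1000;

    int r = select(fd + 1, &fds, NULL, NULL, &timeout);
    if (r < 0)
        return -1;          /* select() itself failed */
    if (r == 0)
        return -2;          /* timed out: read() would have blocked */

    /* fd is reported readable, so this read() should not block. */
    return read(fd, buf, len);
}
```

A caller can treat the -2 result as "nothing arrived yet" and go on with other work instead of hanging inside read().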

You also ask:

Why do we have to increment the file descriptor by 1 and pass it while I am passing the file descriptor set already to select?

That's probably specifying the size of the FD_SET. The first argument to select() is known as nfds and POSIX says:

The nfds argument specifies the range of descriptors to be tested. The first nfds descriptors shall be checked in each set; that is, the descriptors from zero through nfds-1 in the descriptor sets shall be examined.

So, to test a file descriptor n, the value in nfds must be at least n+1.

edited Sep 30 '19 at 17:31

answered Jan 27 '13 at 05:13

Jonathan Leffler


http://manpages.courier-mta.org/htmlman2/select.2.html where we have to pass nfds as the first parameter. As for the first part- I understand now – user1667307 Jan 27 '13 at 05:20
The number of file descriptors must match the number of elements in the array of file handles passed in the second parameter. Passing in a large number will possibly not fail, but can definitely result in interesting invalid memory accesses. – Pekka Jan 27 '13 at 22:25
@Jonathan Leffler is there an example that makes the select function easier to understand? – Kalanidhi Jan 23 '14 at 06:06
Not really; it is not an easy function to understand. It has one of the most complex interfaces of all the system calls in Unix. You could look at [Is it necessary to reset the FD set between `select()` system calls?](http://stackoverflow.com/questions/4563577/); you could look at [Are there any platforms where using structure copy on an FD set (for `select()` or `pselect()`) causes problems?](http://stackoverflow.com/questions/2421672/). You could search for other questions on SO using '[c] select'. – Jonathan Leffler Jan 23 '14 at 06:13
It should be noted that there is also the O_NONBLOCK flag for carrying out non-blocking I/O. – Mike76 Dec 14 '16 at 14:02
@Mike76: You can use `O_NONBLOCK` but it implies that you won't use `select()` (because `O_NONBLOCK` file descriptors are always ready for reading) and it implies that you'll be using polling on the open file descriptors to find which one has work ready. A similar discussion applies to write file descriptors. Using `select()` avoids the overhead of polling. – Jonathan Leffler Dec 14 '16 at 18:42
can anyone add some backstory/reason /why/ they subtract 1 internally, causing users to /always/ add 1? What is the reason against saving users time? Are people adding 1 the minority user-group? What use scenario wouldn't you be adding 1? – nmz787 Mar 20 '18 at 00:55
@nmz787: it is the difference between the number of file descriptors and the maximum file descriptor number. The file descriptor numbers start at zero, so if the maximum file descriptor to be checked has number N, there are N+1 descriptors to check. – Jonathan Leffler Mar 20 '18 at 01:40
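
To make the `O_NONBLOCK` comments above concrete, here is a minimal sketch of what the flag changes. The helper names `set_nonblocking` and `try_read` are hypothetical. With the flag set, a `read()` that has no data fails immediately with `EAGAIN` (or `EWOULDBLOCK`) instead of waiting, so a program that relies on this alone has to keep retrying.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to the descriptor's status flags.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: read without waiting.
 * Returns the number of bytes read, 0 on end-of-file, -1 on error,
 * or -2 if no data is available right now. */
ssize_t try_read(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;   /* nothing to read yet; the caller must retry later */
    return n;
}
```

Calling `try_read()` in a loop until it stops returning -2 is the polling overhead mentioned above; `select()` lets the kernel do the waiting instead.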

score 9 · Answer 2 · answered Jan 27 '13 at 05:28

Programs that want to continue running while also reading interactive user input¹ need to be multithreaded or they need to read input streams carefully and, specifically, conditionally.

select(2) can be used to implement the second design pattern. It reports whether input is available, so the program can read it without blocking the entire application.

¹ Or some other input that arrives unpredictably.

DigitalRoss


And you can also use `poll(2)` – Basile Starynkevitch Jan 27 '13 at 07:53
why does this answer and the previous comment feature `(2)` ? What does this indicate? It makes me think of either an argument being passed, or "read the second definition" (like how an English dictionary can have multiple meanings/usages for the same word). – nmz787 Mar 20 '18 at 00:59
The number denotes a manual section: https://unix.stackexchange.com/questions/3586/what-do-the-numbers-in-a-man-page-mean – bohrax Aug 20 '18 at 11:49

score 4 · Answer 3 · answered Jan 27 '13 at 05:13

You use the select call when you need to monitor file descriptors until they become ready for some I/O, without blocking on any single one of them.

It is generally used when you want I/O (e.g. read()) that does not block; see the man page.
