4

What will happen if I call select on the same open file descriptor from multiple threads?

Is this documented somewhere?

Andrew Tomazos
  • 1
    It can cause undefined behavior. Don't `select` on the same file descriptors concurrently. – obataku Aug 23 '12 at 04:27
  • @veer: Where is this documented? (or how do you know that?) – Andrew Tomazos Aug 23 '12 at 04:34
  • Is it safe to `select` for readability in one thread and `select` for writability in another thread? – Remy Lebeau Aug 23 '12 at 04:37
  • 3
    veer is wrong, it is perfectly legal to call select concurrently. But obviously there are inherent race conditions if both callers assume operations on a single file descriptor won't block -- only one thread will get any single byte of data. – Andy Ross Aug 23 '12 at 05:14
  • May I know what is your requirement for doing this? – Viswesn Aug 23 '12 at 06:12
  • 1
    @Andy Ross: Another interesting thing: sometimes there's a race condition anyway - e.g. select readable on a listening socket doesn't mean the client will still be attempting the connection when accept is called, so it can still block. – Tony Delroy Aug 23 '12 at 06:43
  • In earlier kernels there was a problem referred to as `Thundering herd`. When multiple threads/processes are blocked in `select` and some of the sockets become ready, all the processes/threads wake up. Only some of them get to process the readiness events on the sockets, while the others simply loop idly and go back to sleep in `select` again. On a busy system this could affect the overall performance. – Maksim Skurydzin Aug 23 '12 at 07:06
  • @MaximSkurydin: This is a general problem (not specific to select) related to the futex mechanism that underlies almost all of the synchronization primitives. There was a new feature added to futex to address this, see `man futex`. – Andrew Tomazos Aug 23 '12 at 07:13
  • @MaximSkurydin All processes will get the events, but the slower ones will get EWOULDBLOCK when they try the reads and the data is already gone into a faster process. – user207421 Aug 25 '12 at 00:42

2 Answers

7

According to the POSIX 2008 select specification, there is nothing that prohibits two threads from both calling select at the same time.

It is reasonable to infer that if both threads are monitoring overlapping sets of file descriptors and some of the common file descriptors become readable or writable or have errors diagnosed, then both threads may end up with a report that the common file descriptors are ready. This cannot be guaranteed; there are timing issues to worry about, and it may depend on scheduling of the threads, etc. It also means one of the threads may end up not finding data to read on a file descriptor that it was told contained data to read, precisely because the other thread got there first. Any given byte of data will be read by just one of the threads.

Jonathan Leffler
5

According to the Linux manual page, select is a thread safe function and a cancellation point.

On some operating systems, one thread will successfully enter select while the other threads are blocked (the body of select is a critical section). The next thread to enter select would then probably wake up immediately with the same set of descriptors that was returned to the first thread, since select is a level-triggered interface.

Thus, you can't use select to select on multiple sets of file descriptors simultaneously on those operating systems.
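The level-triggered behaviour mentioned above is easy to check in a single thread. The following is a minimal sketch, not taken from the answer; `readable_now` and `level_triggered_demo` are hypothetical names. It polls a socket with a zero timeout twice while one unread byte is pending, then once more after the byte has been read.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: report whether fd is readable, without blocking. */
static int readable_now(int fd) {
    struct timeval zero = { 0, 0 };
    fd_set rfds;

    assert(fd >= 0 && fd < FD_SETSIZE);  /* FD_SET precondition */
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return select(fd + 1, &rfds, NULL, NULL, &zero) == 1;
}

/* Records three polls: two before draining the byte, one after. */
int level_triggered_demo(int results[3]) {
    int sp[2];
    char byte;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sp) != 0) return -1;
    if (write(sp[1], "x", 1) != 1) return -1;

    results[0] = readable_now(sp[0]);  /* data pending */
    results[1] = readable_now(sp[0]);  /* still pending: reported again */
    if (read(sp[0], &byte, 1) != 1) return -1;
    results[2] = readable_now(sp[0]);  /* drained: no longer readable */

    close(sp[0]);
    close(sp[1]);
    return 0;
}
```

Readiness is reported for as long as the condition holds, not once per event, which is why any number of callers (in one thread or several) can be told the same descriptor is ready.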

Linux seems to support fully re-entrant execution, demonstrated with this test program:

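#include <pthread.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>

/* Each thread waits up to 2 seconds for the shared descriptor to become readable. */
void * reader (void *arg) {
    int *fds = (int *)arg;
    struct timeval to = { 2, 0 };
    fd_set rfds;

    FD_ZERO(&rfds);
    FD_SET(fds[0], &rfds);

    /* Nothing is ever written to the socket pair, so this times out. */
    select(fds[0]+1, &rfds, 0, 0, &to);
    return NULL;
}

int main () {
    int sp[2];
    pthread_t t[2];
    socketpair(AF_UNIX, SOCK_STREAM, 0, sp);
    /* Both threads select on the same descriptor, sp[0]. */
    pthread_create(&t[0], 0, reader, sp);
    pthread_create(&t[1], 0, reader, sp);
    pthread_join(t[0], 0);
    pthread_join(t[1], 0);
    return 0;
}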

When timing this program on Linux (mine was 2.6.43), the program returned after 2 seconds, indicating both threads entered select concurrently.

jxh
  • What is the critical section per? (per process? per thread? per fd? per fd_set?) – Andrew Tomazos Aug 23 '12 at 04:57
  • @AndrewTomazos-Fathomling: Per thread, as I am only considering it in the MT safe context. – jxh Aug 23 '12 at 04:58
  • Also could you expand on how you reached the conclusion "you can't use select to select on multiple sets of file descriptors simultaneously on Linux". As long as at least one of the selects is woken up this might be sufficient for some applications. – Andrew Tomazos Aug 23 '12 at 04:58
  • If the critical section is per thread, then why would the second thread block on it, if it has its own critical section? – Andrew Tomazos Aug 23 '12 at 05:00
  • @AndrewTomazos-Fathomling: I only mean in the MT sense, there is no such thing as two threads simultaneously selecting in the same process. It is one thread, followed by another. – jxh Aug 23 '12 at 05:00
  • @AndrewTomazos-Fathomling: Per process, each thread sees the same critical section. – jxh Aug 23 '12 at 05:00
  • So if two different threads select on different fd_sets, then one will block the other? – Andrew Tomazos Aug 23 '12 at 05:02
  • @AndrewTomazos-Fathomling: Yes. – jxh Aug 23 '12 at 05:02
  • Looking at the [POSIX 2008 `select`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/pselect.html) manual page, I do not see anything that justifies the assertion that two different threads selecting on different (or even the same) fd_sets will block each other. There might be something in the Linux implementation that limits it, but there's nothing I see in the POSIX standard to suggest that. – Jonathan Leffler Aug 23 '12 at 05:11
  • @JonathanLeffler: Yes, I was intentionally targeting Linux in my answer. – jxh Aug 23 '12 at 05:12
  • @JonathanLeffler: It says that select is thread-safe - so what does that imply in this case? – Andrew Tomazos Aug 23 '12 at 05:13
  • 1
    My understanding of 'thread-safe' is that two threads can both call the function without interfering with each other. The `strtok()` function is an example of a non-thread-safe function; it is not safe for two threads to use it concurrently. Actually, `strtok()` is worse than that; even a single-threaded program cannot have two sets of analysis using `strtok()` working at the same time. Other non-thread-safe functions include gems such as `ctime()`; it returns a pointer to a data area that may be reused on subsequent calls. – Jonathan Leffler Aug 23 '12 at 05:17
  • @JonathanLeffler: What does it mean for two threads calling select specifically to "interfere with each other"? – Andrew Tomazos Aug 23 '12 at 05:25
  • Assuming that the two threads do not use the same (global) FD sets as each other, then the two threads can't interfere with each other because the function is thread-safe. If you use the same global FD sets in two threads concurrently, all hell breaks loose, but you get what you deserve. The only 'interference' in Linux appears to be that two separate threads cannot both be active inside `select` at the same time (a claim I am not convinced about; I've now scanned the referenced manual page and see nothing to support the assertion made in the answer). – Jonathan Leffler Aug 23 '12 at 05:33
  • @JonathanLeffler: Clearly no one is suggesting using the same fd_set struct from different threads, only the same fd. You have defined "select thread-safety" as specified in POSIX in terms of "interfering with each other", and then used "interfering" to talk about thread safety. My question is specifically: what does POSIX saying select is thread-safe mean, beyond generally that the calls don't "interfere"? – Andrew Tomazos Aug 23 '12 at 05:37
  • If a function is thread-safe, it generally means that it does not change global memory; it only modifies local variables or data structures passed to it. Further, it will not be affected by another thread changing data. So a thread-safe function's actions do not affect other threads, nor are they affected by other threads. If one thread is notified that fd 37 is ready for reading and that thread reads fd 37 (leaving it in a non-ready state) before another thread that is also monitoring fd 37 is notified, then the second thread probably never wakes because of the activity on fd 37. – Jonathan Leffler Aug 23 '12 at 05:41
  • 1
    @JonathanLeffler: I looked at the source, and I didn't see the lock I was looking for. A test program shows that `select` seems fully reentrant in Linux, so either Linux has changed in this respect since I last tried, or I mixed it up with a different OS. – jxh Aug 23 '12 at 06:51
  • @nos: Winsock does impose serialization on `select`, as was illustrated [here](http://stackoverflow.com/q/11008570/315052). The lock I was looking for was in the Linux kernel system call implementation, not the user space library interface. – jxh Aug 23 '12 at 07:24