SIGIO vs epoll for Linux sockets

Question

The socket documentation for Linux (man 7 socket) says that you can set O_ASYNC on a socket and then receive a signal when the socket is ready for read/write.

However, it seems most people use epoll instead. What is the reason for using epoll rather than this asynchronous signaling system?

score 1 · Answer 1 · answered Oct 22 '20 at 14:55

A central loop that catches all kinds of events makes it very easy to write a single-threaded application, and you don't have to deal with the synchronization problems that can occur when code runs in different execution contexts.

If you use a signal handler, you must take care never to call a function that is not async-signal-safe from the handler. There is a list of async-signal-safe functions you are allowed to call (see man 7 signal-safety), and as you can see, it is a short list! As a result, your signal handler cannot do much, maybe only set a flag or send a message, and the real work must be done "somewhere" else. In short, signal handlers are very limited.

Using signal handlers in multi-threaded applications is also not as easy as it looks at first, because a handler is installed per process and not per thread. Read more: signal handler function in multithreaded environment

score 0 · Answer 2 · answered Mar 20 '23 at 13:19

I think the reason is mainly historical. In the original BSD socket API, it was only possible to register to receive a SIGIO (or SIGURG) signal, which did not include information identifying the file descriptor.

The select() API was designed to multiplex among multiple ready sockets, and I venture that the SIGIO interface was intended to indicate that an application's main loop should call select() (which would not block when triggered by SIGIO).

Because of limitations of select() when dealing with processes that support a large number of file descriptors (as file descriptors are held in a bitmap), Sys V introduced the poll() interface, which replaces the bitmap with an array of descriptors.

poll() itself suffers when the number of descriptors becomes large, as each call needs to copy the entire array to the kernel. Consequently, Linux introduced the epoll interface, which allows file descriptors to be registered once, reducing the overhead of each blocking call.
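To make the "register once, then wait" idea concrete, here is a small sketch using the real epoll_ctl() and epoll_wait() calls. The wrapper names watch_readable and wait_one_readable are invented for this example, and it only handles one event per call to keep things short.

```c
#include <sys/epoll.h>
#include <unistd.h>   /* read()/write()/close() on the watched descriptors */

/* Hypothetical helper: register fd with the epoll instance epfd for
   readability. This is done once per descriptor, not once per wait.
   Returns 0 on success, -1 on error. */
int watch_readable(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical helper: wait up to timeout_ms milliseconds for one
   registered descriptor to become readable. Returns that descriptor,
   or -1 on timeout or error. Only the ready event is copied back,
   not the whole interest list. */
int wait_one_readable(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    return n == 1 ? ev.data.fd : -1;
}
```

A typical loop would create the instance with epoll_create1(0), call watch_readable() for each socket as it is accepted, and then call wait_one_readable() (or epoll_wait() with a larger event array) repeatedly.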

However, the POSIX real-time working group, building on the improved signal handling facilities of SysV, introduced real-time signals, which are queued and carry additional information. For example, both POSIX timers and POSIX asynchronous I/O can indicate events (timer expiry and file operation completion, respectively) via real-time signals. The additional information carried by the signal identifies some user-defined data or the file descriptor, which allows for efficient de-multiplexing of the event stream.

And in fact, Linux has extended the SIGIO mechanism (via the fcntl() F_SETSIG operation) to allow arbitrary signals and to associate additional information with queued signals.
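A minimal, Linux-only sketch of that extension might look like the following. The function name enable_signal_io is made up for this example; F_SETOWN, F_SETSIG and O_ASYNC are the real fcntl() operations and flag. F_SETSIG is only declared when _GNU_SOURCE is defined before the first include.

```c
#define _GNU_SOURCE   /* F_SETSIG is a Linux extension */
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel to send signo (for example
   SIGRTMIN) to this process whenever I/O becomes possible on fd.
   With a signal chosen through F_SETSIG, the siginfo_t that comes
   with the signal has si_fd set to the descriptor.
   Returns 0 on success, -1 on error. */
int enable_signal_io(int fd, int signo)
{
    int flags;

    /* Deliver the signal to this process. */
    if (fcntl(fd, F_SETOWN, getpid()) == -1)
        return -1;

    /* Use signo instead of the default SIGIO. */
    if (fcntl(fd, F_SETSIG, signo) == -1)
        return -1;

    /* Turn on signal-driven, non-blocking I/O for the descriptor. */
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_ASYNC | O_NONBLOCK);
}
```

The receiver can then read si_fd from the siginfo_t, either in an SA_SIGINFO handler or from sigwaitinfo(). One caveat: the real-time signal queue has a limited length, and if it overflows the kernel falls back to a plain SIGIO, so a robust program still needs a way to rescan its descriptors.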

Signal handling is tricky in POSIX though, primarily because a signal sent to the process cannot be directed to a given thread: it may be delivered to any thread that does not have it blocked. To control this, you need to block that signal in all threads except the one(s) you designate for handling that I/O. A fairly easy way to do this is to block the signal during initialization, while the process is still single-threaded (new threads inherit the mask), and to unblock it in exactly those threads that can handle it.

An easy way of working around the async-signal safety issue is to receive and process signals synchronously via sigwaitinfo() or sigtimedwait().
