The `io_getevents` notification mechanism looks quite capable at first glance, so I would like something I could use with it. I just couldn't find anything yet. On Windows, it's easy: There is only `TransmitFile`, which can work asynchronously (overlapped) and with some notification mechanism (IOCP, event) if you want that. There must be some equivalent on Linux, right? Or, to put my question in some context, how would I create an efficient file server on Linux?

purefanatic
-
If the socket is non-blocking, then `sendfile` won't block either (it will report how much data was scheduled to be sent in the socket's "buffer"). You will need to poll the socket to see when you can continue the `sendfile` operation (see `epoll`)... or better yet, use a library that does this for you. – Myst Dec 30 '17 at 14:55
-
@Myst Mh, when I think of asynchronous I/O, I think of operations that I can start at arbitrary points in time and get notified when they finish. With epoll+sendfile, I first have to wait until send buffers are available, call sendfile which will copy some amount of data to said buffers (synchronously!), rinse and repeat. – purefanatic Dec 31 '17 at 02:13
-
Also, I read that sendfile might block even when used with non-blocking sockets, and that one can work around that using `readahead`: http://brad.livejournal.com/2228488.html This introduces even more complex application design and more latency because of the number of context switches needed before actually doing work. I don't really find the whole "non-blocking" approach satisfying. – purefanatic Dec 31 '17 at 02:14
-
It's true that `sendfile` isn't `asio`, but it does **not** copy the data synchronously to the socket's buffer (that's why it's important to set the socket to non-blocking)... actually, it doesn't even copy the data (which is part of its optimization). From what I remember, the data is packaged directly from the file buffer. – Myst Dec 31 '17 at 02:15
-
@Myst Ok, but then the sendfile documentation is quite misleading. It clearly says "If the transfer was successful, the number of bytes written to out_fd is returned." Also, nonblocking send/recv _has_ to copy synchronously, there's just no other way. But that is not the point. For example, I would like to have multiple send operations in flight at once. I don't think this is possible using nonblocking sockets+epoll, right? With actual asynchronous I/O I could queue up some headers followed by actual file data. The OS could start sending my headers while prefetching the file data just in time. – purefanatic Dec 31 '17 at 02:46
-
I'm sorry, this is a bit of a long discussion for comments, and it might be too opinion based. At the end of the day the data needs to be packaged and handed off to the network layer. The question of who does it (your code, a system call or the OS scheduler) is mostly a question of minimizing operations and achieving maximum throughput. Also, as long as you have only one network card, concurrency is mostly out the window when the data reaches the wires... I find `epoll` more convenient and very performant, while allowing me more control. – Myst Dec 31 '17 at 03:05
-
Yeah you might be right. I find epoll rather limiting, regarding both application design and performance. I find the kernel aio API more appealing but there does not seem to be support for sendfile yet. – purefanatic Dec 31 '17 at 12:57
1 Answer
Alas, there is nothing easy for you on Linux and nearly anything can block in the wrong circumstances (even `io_submit`). In answer to your questions (in the title and within the main text):
- (2019) There's no system-provided asynchronous version of `sendfile` in Linux (Linux isn't Windows or FreeBSD). There's an excellent write-up covering `sendfile` blocking, caveats and ideas in a TANK distributed log issue. It notes that lighttpd came up with an "asynchronous" `sendfile` hack, but it's complicated and uses threads.
- It requires a different mindset to create an efficient file server on Linux in comparison to Windows. Take a look at this NGINX blog post about what they do to make things fast on Linux or this Scylla blog post about different I/O access methods for Linux for the tradeoffs involved.
Them's the breaks...
Future (2020+) solutions
There's a suggestion that some future Linux kernel (later than 5.5, as we're already up to 5.5-rc7 at the time of writing) could essentially perform an asynchronous sendfile via io_uring if io_uring gains support for `splice()`...

Anon
-
The blog post you linked seems to suggest offloading blocking operations in userspace thread pools, which is the same thing that arvid did with [libtorrent in 2012](https://blog.libtorrent.org/2012/10/asynchronous-disk-io/) for lack of better alternatives. This is exactly what I would like to avoid at all costs in favor of some more efficient APIs. I just hoped Linux would have improved things in the course of over 6 years. I mean if FreeBSD can get it right, why would Linux not be able to? Both are POSIX, so I feel like they share similar handicaps. – purefanatic Feb 08 '18 at 11:34
-
Also I would expect `io_submit` to only block under extreme circumstances, like the I/O request queue or the completion queue filling. I mean in that case, you really _do_ have to wait for things to complete. – purefanatic Feb 08 '18 at 11:35
-
@purefanatic Lack of a great pervasive async I/O framework is just a flaw in Linux - being popular doesn't mean you're always best in every category. POSIX never mandated an async framework so that's orthogonal to this. `io_submit` can block when you do a buffered read that is not cached, which is not that exotic or extreme. – Anon Feb 08 '18 at 13:59
-
Thanks for the reference on `io_submit`! I think that POSIX never mandating a proper async framework is one of the exact reasons that Linux also was not designed with asynchronous I/O in mind. That's why I called it a handicap. – purefanatic Feb 08 '18 at 16:44
-
@purefanatic You're welcome. OK, correction time - I should have spoken more carefully. POSIX does specify a set of AIO operations (see http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/aio.h.html ) but a) it's optional, b) on Linux it is glibc that implements those functions using a thread pool, and c) there's no sendfile in POSIX. What I should have said was that there's no POSIX-defined AIO framework that works for most existing operations rather than defining a few new ones that are async. – Anon Feb 08 '18 at 18:15
-
Unfortunately I'm still not convinced by io_uring's "solution" of having a pipe in the middle and splicing data to it and from it. If I understand it correctly I would need to submit a splice to the pipe, wait for it to finish, submit another splice from the pipe to the receiving end and start over if the file was too big to be spliced completely in the first call. And I'd need to allocate and deallocate a pipe or have to maintain a pool of pipes. Seems strictly worse than Winsock's TransmitFile which is able to send up to 2 GiB in a single system call. – purefanatic Mar 26 '21 at 12:43
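To make the two-step pattern from the last comment concrete, here is a minimal sketch using plain, synchronous `splice(2)`. It is an illustration only and does not use io_uring at all; the helper name `splice_file` is hypothetical, blocking descriptors are assumed, and the pipe is created per call rather than taken from a pool.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* splice() is a Linux-specific extension */
#endif
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: move up to `count` bytes from in_fd (at its current
 * file position) to out_fd through an intermediate pipe.
 * Returns the number of bytes moved, or -1 on error. */
static ssize_t splice_file(int in_fd, int out_fd, size_t count)
{
    int p[2];
    ssize_t total = 0;

    if (pipe(p) < 0)
        return -1;

    while (count > 0) {
        /* Step 1: file -> pipe. At most one pipe buffer's worth is moved,
         * so large files need several rounds. */
        ssize_t in = splice(in_fd, NULL, p[1], NULL, count, SPLICE_F_MOVE);
        if (in < 0) {
            total = -1;
            break;
        }
        if (in == 0)
            break;                      /* end of input */
        count -= (size_t)in;

        /* Step 2: pipe -> destination, draining everything just queued. */
        while (in > 0) {
            ssize_t out = splice(p[0], NULL, out_fd, NULL,
                                 (size_t)in, SPLICE_F_MOVE);
            if (out <= 0) {
                total = -1;
                goto done;
            }
            in -= out;
            total += out;
        }
    }

done:
    close(p[0]);
    close(p[1]);
    return total;
}
```

Each round is bounded by the pipe's capacity, which is why the loop has to "start over" for big files. An io_uring-based version would presumably submit these same two steps as queued requests instead of calling them inline.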