It's worth pointing out that there are multiple contexts at play here.
The Linux operating system
From Non-Blocking descriptors:
By default, read on any descriptor blocks if there’s no data
available. The same applies to write or send. This applies to
operations on most descriptors except disk files, since writes to disk
never happen directly but via the kernel buffer cache as a proxy. The
only time when writes to disk happen synchronously is when the O_SYNC
flag was specified when opening the disk file.
Any descriptor (pipes, FIFOs, sockets, terminals, pseudo-terminals,
and some other types of devices) can be put in the nonblocking mode.
When a descriptor is set in nonblocking mode, an I/O system call on
that descriptor will return immediately, even if that request can’t be
immediately completed (and will therefore result in the process being
blocked otherwise). The return value can be either of the following:
- an error: when the operation cannot be completed at all
- a partial count: when the input or output operation can be partially completed
- the entire result: when the I/O operation could be fully completed
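The three outcomes listed above can be observed with a pipe. This is a minimal sketch, assuming a Unix-like system and Python 3.5+ (for `os.set_blocking`); `try_read` is a hypothetical helper introduced only for illustration.

```python
import os

def try_read(fd, size):
    """Read from a non-blocking descriptor.

    Returns the bytes read (possibly fewer than `size`), or None when the
    call would have blocked (the kernel reports EAGAIN/EWOULDBLOCK).
    """
    try:
        return os.read(fd, size)
    except BlockingIOError:
        return None

read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)  # put the read end in non-blocking mode

# 1. Error: nothing has been written yet, so the read cannot complete at all.
empty_result = try_read(read_fd, 1024)

# 2. Partial count: 1024 bytes requested, only 5 are available.
os.write(write_fd, b"hello")
partial_result = try_read(read_fd, 1024)

# 3. Entire result: exactly as many bytes are available as were requested.
os.write(write_fd, b"abc")
full_result = try_read(read_fd, 3)

os.close(read_fd)
os.close(write_fd)
```

In every case the call returns immediately; the caller decides whether to retry later, typically after the descriptor is reported ready by `select`, `poll`, or `epoll`.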
As explained above, non-blocking descriptors keep reads and writes on pipes (or sockets, and so on) from blocking indefinitely. They were not meant for disk files, however: whether you read an entire file or just part of it, the data is already there. It is not going to arrive at some point in the future, so you can start processing it right away.
Quoting your linked post:
Regular files are always readable and they are also always writeable.
This is clearly stated in the relevant POSIX specifications. I cannot
stress this enough. Putting a regular file in non-blocking has
ABSOLUTELY no effects other than changing one bit in the file flags.
Reading from a regular file might take a long time. For instance, if
it is located on a busy disk, the I/O scheduler might take so much
time that the user will notice that the application is frozen.
Nevertheless, non-blocking mode will not fix it. It will simply not
work. Checking a file for readability or writeability always succeeds
immediately. If the system needs time to perform the I/O operation, it
will put the task in non-interruptible sleep from the read or write
system call. In other words, if you can assume that a file descriptor
refers to a regular file, do not waste your time (or worse, other
people's time) in implementing non-blocking I/O.
The only safe way to read data from or write data to a regular file
while not blocking a task... consists of not performing the operation,
not in that particular task anyway. Concretely, you need to create a separate thread (or process), or use asynchronous I/O (functions whose
name starts with aio_). Whether you like it or not, and even if you
think multiple threads suck, there are no other options.
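The quoted claim that a readiness check on a regular file always succeeds immediately can be checked with `select`. This is a rough sketch, assuming a Unix-like system; `is_ready_for_read` is a hypothetical helper, and the empty pipe is included only as a contrast.

```python
import os
import select
import tempfile

def is_ready_for_read(fd):
    """Poll once (timeout 0) and report whether fd is flagged readable."""
    readable, _, _ = select.select([fd], [], [], 0)
    return fd in readable

with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"some data on disk")
    tmp.flush()
    # O_NONBLOCK is accepted for a regular file, but it changes nothing
    # about how the read below behaves.
    fd = os.open(tmp.name, os.O_RDONLY | os.O_NONBLOCK)
    try:
        file_ready = is_ready_for_read(fd)
        file_data = os.read(fd, 1024)
    finally:
        os.close(fd)

# Contrast: an empty pipe is not readable until something is written to it.
pipe_r, pipe_w = os.pipe()
pipe_ready = is_ready_for_read(pipe_r)
os.close(pipe_r)
os.close(pipe_w)
```

The file is reported ready regardless of how long the disk would actually take, which is why readiness polling cannot be used to avoid blocking on regular files.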
The .NET runtime
It implements the async/await pattern to unblock the main event loop while I/O is being performed. As mentioned above:
Concretely, you need to create a separate thread (or process), or use
asynchronous I/O (functions whose name starts with aio_). Whether you
like it or not, and even if you think multiple threads suck, there are
no other options.
The .NET thread pool will spawn additional worker threads as needed (ref why is .NET spawning multiple processes on Linux). So, ideally, when the .NET FileStream.ReadAsync(...) or FileStream.WriteAsync(...) overloads are called, the current thread (from the thread pool) initiates the I/O operation and then gives up control, freeing it to do other work. Before it does, a continuation is registered on the I/O operation, so when the I/O device signals that the operation has finished, the thread pool scheduler knows that the next free thread can pick up the continuation.
To be clear, this is all about responsiveness. Any code that depends on the result of the I/O still has to wait for it; it just won't "block" the rest of the application.
Back to OS
A thread giving up control during I/O, so that it is freed up for other work, is something Windows supports natively:
https://learn.microsoft.com/en-us/troubleshoot/windows/win32/asynchronous-disk-io-synchronous
Asynchronous file I/O has not been part of Linux (at least not for very long). The flow .NET uses there instead is described at:
https://devblogs.microsoft.com/dotnet/file-io-improvements-in-dotnet-6/#unix
Unix-like systems don’t expose async file IO APIs (except of the new
io_uring which we talk about later). Anytime user asks FileStream to
perform async file IO operation, a synchronous IO operation is being
scheduled to Thread Pool. Once it’s dequeued, the blocking operation
is performed on a dedicated thread.
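The flow in that quote (a synchronous read queued to a thread pool, with a continuation that runs once it completes) can be sketched in a few lines. This is only an illustration of the general pattern in Python, not how the .NET runtime implements it; `blocking_read` and `continuation` are hypothetical names.

```python
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor

def blocking_read(path):
    """Plain synchronous read; this is the part that may block on disk."""
    with open(path, "rb") as f:
        return f.read()

results = []
done = threading.Event()

def continuation(future):
    """Runs once the read has finished and picks up the result."""
    results.append(future.result())
    done.set()

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"payload")
    path = tmp.name

try:
    with ThreadPoolExecutor(max_workers=2) as pool:
        future = pool.submit(blocking_read, path)  # queue the blocking call
        future.add_done_callback(continuation)     # register the continuation
        # The submitting thread is free to do other work at this point.
        done.wait(timeout=5)
finally:
    os.remove(path)
```

The read itself still blocks a thread; the only thing that changes is which thread pays for the wait.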
A similar flow is suggested by Python's asyncio implementation:
asyncio does not support asynchronous operations on the filesystem.
Even if files are opened with O_NONBLOCK, read and write will block.
...
The Linux kernel provides asynchronous operations on the filesystem
(aio), but it requires a library and it doesn't scale with many
concurrent operations. See aio.
...
For now, the workaround is to use aiofiles that uses threads to handle
files.
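The same thread-based workaround is available from the standard library. This is a minimal sketch, assuming Python 3.9+ for `asyncio.to_thread`; `blocking_read`, `read_file`, and `other_work` are hypothetical names used only for this example.

```python
import asyncio
import os
import tempfile

def blocking_read(path):
    """Ordinary synchronous file read."""
    with open(path, "rb") as f:
        return f.read()

async def read_file(path):
    # The blocking call runs on a worker thread, so the event loop
    # keeps scheduling other coroutines in the meantime.
    return await asyncio.to_thread(blocking_read, path)

async def other_work():
    await asyncio.sleep(0)  # yields to the event loop once
    return "loop stayed responsive"

async def main(path):
    return await asyncio.gather(read_file(path), other_work())

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"file contents")
    path = tmp.name

try:
    data, note = asyncio.run(main(path))
finally:
    os.remove(path)
```

Here `await` frees the event loop, not the disk: the read is still a blocking system call, just on a different thread.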
Closing thoughts
The concept behind Linux's non-blocking descriptors (and the polling mechanism that goes with them) is not what makes async I/O tick on Windows.
As mentioned by @Damien_The_Unbeliever, there's a relatively new io_uring Linux kernel interface that allows a continuation flow similar to the one on Windows. However, the following links confirm this is not yet implemented in .NET 6: