How is non-blocking I/O for regular files implemented in .NET on Linux?

Question

As far as I know, all I/O on regular files is always blocking in Linux (see here). However, you can still call File.ReadBLAHAsync(...)/File.WriteBLAHAsync(...) or do other file-related work just fine.

Are these wrappers faking the async call just to stay backward compatible, or to somehow keep the sync context satisfied?

That's what I am saying; the question is about how .NET makes it look async. — SHM, Jan 29 '22 at 12:02
It looks like there is an asynchronous API for IO on Linux called `io_uring`. However, it's very new and doesn't appear to have broad support yet so is unlikely to have been picked up as a dependency in something broadly targeting many distributions like .NET. — Damien_The_Unbeliever, Feb 01 '22 at 13:55

score 2 · Accepted Answer · answered Feb 01 '22 at 21:29

It's worth pointing out that there are multiple contexts at play here.

The Linux operating system

From Non-Blocking descriptors:

By default, read on any descriptor blocks if there’s no data available. The same applies to write or send. This applies to operations on most descriptors except disk files, since writes to disk never happen directly but via the kernel buffer cache as a proxy. The only time when writes to disk happen synchronously is when the O_SYNC flag was specified when opening the disk file.

Any descriptor (pipes, FIFOs, sockets, terminals, pseudo-terminals, and some other types of devices) can be put in the nonblocking mode. When a descriptor is set in nonblocking mode, an I/O system call on that descriptor will return immediately, even if that request can’t be immediately completed (and will therefore result in the process being blocked otherwise). The return value can be either of the following:

an error: when the operation cannot be completed at all

a partial count: when the input or output operation can be partially completed

the entire result: when the I/O operation could be fully completed

As explained above, non-blocking descriptors prevent pipes (or sockets, or...) from blocking indefinitely. They weren't meant to be used with disk files, however, because whether you want to read an entire file or just a part of it, the data is already there. It's not going to arrive at some point in the future, so you can start processing it right away.

Quoting your linked post:

Regular files are always readable and they are also always writeable. This is clearly stated in the relevant POSIX specifications. I cannot stress this enough. Putting a regular file in non-blocking has ABSOLUTELY no effects other than changing one bit in the file flags.

Reading from a regular file might take a long time. For instance, if it is located on a busy disk, the I/O scheduler might take so much time that the user will notice that the application is frozen.

Nevertheless, non-blocking mode will not fix it. It will simply not work. Checking a file for readability or writeability always succeeds immediately. If the system needs time to perform the I/O operation, it will put the task in non-interruptible sleep from the read or write system call. In other words, if you can assume that a file descriptor refers to a regular file, do not waste your time (or worse, other people's time) in implementing non-blocking I/O.

The only safe way to read data from or write data to a regular file while not blocking a task... consists of not performing the operation, not in that particular task anyway. Concretely, you need to create a separate thread (or process), or use asynchronous I/O (functions whose name starts with aio_). Whether you like it or not, and even if you think multiple threads suck, there are no other options.
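
The "always readable" behaviour from the quoted post can be observed from .NET itself. The following is a minimal sketch, not anything the runtime does: it assumes Linux, assumes that `DllImport("libc")` resolves to the system C library, and relies on the fact that on Unix a `SafeFileHandle` wraps the raw file descriptor. The class and method names (`RegularFilePollDemo`, `IsReportedReadable`) are made up for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

static class RegularFilePollDemo
{
    // Mirrors struct pollfd from <poll.h> on Linux.
    [StructLayout(LayoutKind.Sequential)]
    struct PollFd
    {
        public int fd;
        public short events;
        public short revents;
    }

    const short POLLIN = 0x0001; // "there is data to read"

    // int poll(struct pollfd *fds, nfds_t nfds, int timeout);
    [DllImport("libc", SetLastError = true)]
    static extern int poll(ref PollFd fds, UIntPtr nfds, int timeoutMs);

    // Returns true if poll() reports the file as readable within timeoutMs.
    public static bool IsReportedReadable(string path, int timeoutMs)
    {
        using var fs = new FileStream(path, FileMode.Open, FileAccess.Read);

        // Assumption: on Unix the handle value is the file descriptor.
        var pfd = new PollFd
        {
            fd = fs.SafeFileHandle.DangerousGetHandle().ToInt32(),
            events = POLLIN
        };

        int ready = poll(ref pfd, new UIntPtr(1), timeoutMs);
        if (ready < 0)
        {
            throw new IOException($"poll failed, errno {Marshal.GetLastWin32Error()}");
        }
        return ready == 1 && (pfd.revents & POLLIN) != 0;
    }

    static void Main()
    {
        string path = Path.GetTempFileName(); // an empty regular file
        try
        {
            // A timeout of 0 means "do not wait at all".
            bool readable = IsReportedReadable(path, 0);
            Console.WriteLine($"Readable according to poll: {readable}");
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Because readiness is reported immediately for a regular file, a poll-based event loop learns nothing about how long the subsequent read will actually take, which is why the quoted post recommends a separate thread or an asynchronous I/O interface instead.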

The .NET runtime

The runtime implements the async/await pattern so that the main event loop is not blocked while I/O is being performed. As mentioned above:

Concretely, you need to create a separate thread (or process), or use asynchronous I/O (functions whose name starts with aio_). Whether you like it or not, and even if you think multiple threads suck, there are no other options.

The .NET thread pool will spawn additional threads as needed (ref why is .NET spawning multiple processes on Linux). So, ideally, when the .NET FileStream.ReadAsync(...) or FileStream.WriteAsync(...) overloads are called, the current thread (from the thread pool) initiates the I/O operation and then gives up control, freeing it to do other work. Before it does, a continuation is attached to the I/O operation, so that when the I/O device signals that the operation has finished, the thread pool scheduler knows the next free thread can pick up the continuation.

To be sure, this is all about responsiveness. Any code that needs the result of the I/O still has to wait for it to complete, but the wait won't "block" the application.
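
A small sketch of that distinction, assuming nothing beyond the standard `File.ReadAllBytesAsync` API: the caller starts the read, stays free to do unrelated work, and only waits at the point where the bytes are actually needed. The names `OverlapDemo` and `ReadWhileWorkingAsync` are hypothetical.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class OverlapDemo
{
    // Starts a file read, does unrelated work while it is pending, and
    // waits only where the data is actually required.
    public static async Task<int> ReadWhileWorkingAsync(string path)
    {
        Task<byte[]> pendingRead = File.ReadAllBytesAsync(path); // read has been started

        Console.WriteLine("Doing other work while the read is pending...");

        byte[] data = await pendingRead; // code that needs the data waits here
        return data.Length;
    }

    static async Task Main()
    {
        string path = Path.GetTempFileName();
        await File.WriteAllBytesAsync(path, new byte[1024]);
        try
        {
            int length = await ReadWhileWorkingAsync(path);
            Console.WriteLine($"Read {length} bytes");
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

The `await` does not make the read itself faster; it only lets the calling thread return to other work until the result is available.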

Back to OS

The thread giving up control, which eventually leads to it being freed up, can be achieved on Windows:

https://learn.microsoft.com/en-us/troubleshoot/windows/win32/asynchronous-disk-io-synchronous

Asynchronous file I/O hasn't been part of Linux (at least not for very long). The flow we have here is described at:

https://devblogs.microsoft.com/dotnet/file-io-improvements-in-dotnet-6/#unix

Unix-like systems don’t expose async file IO APIs (except of the new io_uring which we talk about later). Anytime user asks FileStream to perform async file IO operation, a synchronous IO operation is being scheduled to Thread Pool. Once it’s dequeued, the blocking operation is performed on a dedicated thread.
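
The flow described in that quote can be approximated in a few lines. This is only a sketch of the idea, not the runtime's actual implementation: a hypothetical helper, `ReadOnThreadPoolAsync`, queues an ordinary blocking `Read` to the thread pool with `Task.Run` and hands the caller a `Task<int>` to await.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class ThreadPoolFileRead
{
    // Hypothetical helper (not the runtime's code): run the blocking Read on
    // a thread pool thread and expose the result as an awaitable Task.
    public static Task<int> ReadOnThreadPoolAsync(
        FileStream stream, byte[] buffer, int offset, int count)
    {
        return Task.Run(() => stream.Read(buffer, offset, count));
    }

    static async Task Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[] { 1, 2, 3, 4, 5 });
        try
        {
            using var fs = new FileStream(path, FileMode.Open, FileAccess.Read);
            var buffer = new byte[16];

            // The calling thread is free while a pool thread blocks in Read.
            int read = await ReadOnThreadPoolAsync(fs, buffer, 0, buffer.Length);
            Console.WriteLine($"Read {read} bytes");
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

The blocking system call still happens; it is simply moved to a thread whose only job is to wait for it, which matches the "separate thread" option from the Linux section above.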

A similar flow is suggested for Python's asyncio:

asyncio does not support asynchronous operations on the filesystem. Even if files are opened with O_NONBLOCK, read and write will block.

...

The Linux kernel provides asynchronous operations on the filesystem (aio), but it requires a library and it doesn't scale with many concurrent operations. See aio.

...

For now, the workaround is to use aiofiles that uses threads to handle files.

Closing thoughts

The concept behind Linux's non-blocking descriptors (and the associated polling mechanism) is not what makes async I/O tick on Windows.

As mentioned by @Damien_The_Unbeliever, there's a relatively new io_uring Linux kernel interface that allows a continuation flow similar to the one on Windows. However, the following links confirm this is not yet implemented in .NET 6:

Thanks! So they put a bunch of threads to be blocked for I/O processing; that makes sense. However, I think 'always readable and writable' refers to the fact that poll/epoll/select will always trigger an event for regular files, so the data is not necessarily in the kernel buffer. Consider a situation where one seeks to a position in a file and then calls read (so the data is clearly not there and needs to be loaded from the disk). — SHM, Feb 02 '22 at 05:01
The data is there, on disk. In case of pipes, sockets, terminals,..., the data may not exist yet. When a socket is listening for incoming connections, it may have to wait for seconds, days or even years. What your linked post is saying is that this check for existence is redundant in case of regular files. It may take a while until the read is done, but the work can be queued without having to wait for an external event. — Funk, Feb 02 '22 at 08:05

score 0 · Answer 2 · answered Feb 01 '22 at 13:41

isAsync allows you to control whether a file should be opened for asynchronous or synchronous I/O. The default value is false, which means synchronous I/O. If you open a FileStream for synchronous I/O, but later use any of its *Async() methods, they will perform synchronous I/O (no cancellation support) on a ThreadPool, which may not scale as well as if the FileStream were open for asynchronous I/O.

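using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

namespace NonBlock
{
    class ReadWrite
    {
        // Reads from the stream until end of input or until an exception is thrown.
        static async Task Begin(FileStream s)
        {
            Console.WriteLine();

            try
            {
                byte[] buffer = new byte[4096];
                while (true)
                {
                    var read = await s.ReadAsync(buffer, 0, buffer.Length);
                    if (read == 0)
                    {
                        // End of input: without this check the loop never ends.
                        break;
                    }
                    Console.WriteLine($"Read {read} bytes");
                }
            }
            catch (Exception ex)
            {
                // Errors, such as using the stream after it is disposed, are printed here.
                Console.WriteLine(ex);
            }
        }

        static void Main(string[] args)
        {
            Console.WriteLine();
            // Open standard input via /proc and request asynchronous I/O.
            var fs = new FileStream(
                "/proc/self/fd/0",
                FileMode.Open,
                FileAccess.Read,
                FileShare.None,
                4096, useAsync: true);

            // Start reading without waiting for the loop to finish.
            _ = Begin(fs);

            Thread.Sleep(5000);
            fs.Dispose();
            Console.WriteLine();
            Thread.Sleep(-1); // Timeout.Infinite: keep the process alive
        }
    }
}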

Thanks, but it's not about the question. — SHM, Feb 03 '22 at 03:51
