
I am using Linux aio (io_submit() / io_getevents()) for file I/O. Since some operations do not have aio equivalents (open(), fsync(), fallocate()), I use a worker thread that may block without impacting the main thread. My question is: should I add close() to this list?

All files are opened with O_DIRECT on XFS, but I am interested both in the general answer to the question and in the specific answer for my choice of filesystem and open mode.

Note that using a worker thread for close() is not trivial since close() is often called in cleanup paths, which aren't good places to launch a worker thread request and wait for it. So I'm hoping that close() is non-blocking in this scenario.

For this question, "blocking" means waiting on an I/O operation, or on some lock that may only be released when an I/O operation completes, but excluding page fault servicing.
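
For reference, here is a minimal sketch of the submission path I am describing (the file name, buffer size, and error handling are illustrative only, and it assumes libaio is available):

```c
/* Illustrative sketch of the io_submit()/io_getevents() path; not the
 * actual application code. Build with -laio. */
#define _GNU_SOURCE          /* for O_DIRECT */
#include <libaio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    io_context_t ctx = 0;
    if (io_setup(128, &ctx) != 0) { fprintf(stderr, "io_setup failed\n"); return 1; }

    /* O_DIRECT requires aligned buffers and offsets. */
    int fd = open("data.bin", O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    void *buf;
    if (posix_memalign(&buf, 4096, 4096) != 0) return 1;

    struct iocb cb;
    struct iocb *cbs[1] = { &cb };
    io_prep_pread(&cb, fd, buf, 4096, 0);

    if (io_submit(ctx, 1, cbs) != 1) { fprintf(stderr, "io_submit failed\n"); return 1; }

    struct io_event ev;
    if (io_getevents(ctx, 1, 1, &ev, NULL) != 1) { fprintf(stderr, "io_getevents failed\n"); return 1; }
    printf("read %ld bytes\n", (long)ev.res);

    close(fd);          /* <-- the call this question is about */
    io_destroy(ctx);
    free(buf);
    return 0;
}
```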

Avi Kivity
  • From a cursory reading of the code, it looks like `close()` may block for some filesystems, but not others. So I guess I won't get a definitive answer. – Avi Kivity Jul 05 '15 at 10:30

1 Answer


close() may block on some filesystems. Where possible, code should be written as portably as is practical. As such, you should definitely add close() to the list of calls that are made only from your blocking worker thread.

However, you mention that you often have to call close() in cleanup paths. If those cleanup paths run only when your application terminates, calling close() directly may not matter much even if it does block.

Alternatively, you could have a queue that feeds a pool of workers. This is what glibc's POSIX AIO does for many calls: when you initialize AIO with aio_init(), glibc sets up a queue and a pool of worker threads. Every time an AIO call is made, glibc simply adds the relevant task and data to the queue; in the background, the worker threads pull work off the queue, execute the blocking calls, and then perform any relevant follow-up actions.
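
Roughly, that mechanism looks like this from the application's side (a minimal sketch; the file name, pool size, and buffer size are made up, and older glibc needs -lrt):

```c
/* Sketch of the glibc POSIX AIO pattern described above. */
#define _GNU_SOURCE          /* for aio_init()/struct aioinit */
#include <aio.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Size the worker-thread pool and request queue up front;
     * glibc dispatches each aio_* call to these threads. */
    struct aioinit init = { .aio_threads = 4, .aio_num = 64 };
    aio_init(&init);

    int fd = open("data.bin", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    static char buf[4096];
    struct aiocb cb = {
        .aio_fildes = fd,
        .aio_buf    = buf,
        .aio_nbytes = sizeof buf,
        .aio_offset = 0,
    };

    /* Returns immediately; a worker thread performs the blocking read. */
    if (aio_read(&cb) < 0) { perror("aio_read"); return 1; }

    /* Wait for completion, then collect the result. */
    const struct aiocb *list[1] = { &cb };
    aio_suspend(list, 1, NULL);
    printf("read %zd bytes\n", aio_return(&cb));

    close(fd);
    return 0;
}
```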

If you really do need a non-blocking close() (and other calls), it may be to your advantage to set up a task queue and a thread pool of your own: submit the specific calls to the queue and have the thread pool execute them as they come in.
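
Here is a minimal sketch of that idea, assuming a dedicated pool just for close() (the queue size and function names are illustrative, and shutdown handling is omitted). Because the queue is a fixed-size array, submitting a close() never allocates, which keeps cleanup paths safe; if the queue fills up, the submitter briefly blocks instead.

```c
/* Sketch: hand file descriptors to worker threads that perform the
 * potentially blocking close(). Not a drop-in implementation. */
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

#define QUEUE_CAP 256

static int queue[QUEUE_CAP];
static int head, tail, count;
static pthread_mutex_t lock      = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;

/* Called from the main (non-blocking) thread: hand the fd off and return.
 * No allocation; blocks only if the queue is completely full. */
void submit_close(int fd)
{
    pthread_mutex_lock(&lock);
    while (count == QUEUE_CAP)
        pthread_cond_wait(&not_full, &lock);
    queue[tail] = fd;
    tail = (tail + 1) % QUEUE_CAP;
    count++;
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&lock);
}

/* Worker threads: pop fds and perform the close(), blocking or not. */
void *close_worker(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (count == 0)
            pthread_cond_wait(&not_empty, &lock);
        int fd = queue[head];
        head = (head + 1) % QUEUE_CAP;
        count--;
        pthread_cond_signal(&not_full);
        pthread_mutex_unlock(&lock);

        close(fd);  /* may block; that's fine on this thread */
    }
    return NULL;
}

/* Example: start two workers, then queue an fd for closing. */
int main(void)
{
    pthread_t workers[2];
    for (int i = 0; i < 2; i++)
        pthread_create(&workers[i], NULL, close_worker, NULL);

    int fd = open("/dev/null", O_RDONLY);
    if (fd >= 0)
        submit_close(fd);

    sleep(1);   /* give the workers a moment; real code would join/shut down */
    return 0;
}
```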

haneefmubarak
  • I do have a thread pool and a queue. But queuing something requires an allocation, which may fail. This is not something you want on a cleanup path, as it can result in a file descriptor leak. – Avi Kivity Jul 06 '15 at 08:22
  • @AviKivity queuing should not require any additional allocations when you know precisely what you are sending ahead of time. In this case, you will be sending tasks to call functions whose arity and parameters are known ahead of time. Therefore, you should be able to use a suitably sized [ring buffer](https://en.wikipedia.org/wiki/Circular_buffer) for instance. Here is [another answer on SO that correctly implements a ring buffer](http://stackoverflow.com/a/215575/2334407). – haneefmubarak Jul 07 '15 at 22:29
  • I don't know how many close() calls I have in advance as my application doesn't use a fixed number of files. – Avi Kivity Jul 09 '15 at 06:53
  • @AviKivity certainly. However, you **do** know what calls your application will be making. Therefore, what you can do is create a reasonably sized buffer (the size varies depending on the platform), and if too many things hit the queue, a few threads will just have to block until the queue gets some more space. – haneefmubarak Jul 09 '15 at 06:56