8

I'm trying to figure out if mmap'ing a file, and then using madvise() or posix_madvise() with MADV_WILLNEED/POSIX_MADV_WILLNEED actually triggers background async I/O for read-ahead. The man pages for madvise don't specify whether this is the case - the actual behavior of madvise is left mostly unclear, in order to allow for flexibility of the implementation.

But does any actual mainstream POSIX implementation (like Linux) actually perform async file I/O when madvise() with MADV_WILLNEED is called? I can't seem to get any reliable information about this. This question suggests it does, on Linux at least, even if it is not ideal since there is no callback mechanism.

This book excerpt claims that posix_fadvise with POSIX_FADV_WILLNEED will do asynchronous read ahead, but doesn't mention if madvise() does async read ahead.

Furthermore, it would seem that the whole concept of "read-ahead" I/O doesn't really make any sense unless it's asynchronous. If it were synchronous, it would simply make the user application block for the read-ahead instead of later when actually reading the file, which doesn't seem like a particularly powerful optimization.

So, does madvise() with MADV_WILLNEED actually do async read-ahead on any mainstream platform (like Linux)?

Siler
  • In answer to the question posed in your title, no, I don't think it reasonable to characterize `mmap()` + `madvise()` as a form of async I/O, since the kernel is not *obligated* to do anything whatsoever in response to an `madvise()` call specifying `MADV_WILLNEED` (unlike when the advice is `MADV_DONTNEED`). As for whether any system actually will perform async readahead under any circumstances, I am inclined to suppose that some will, but I cannot provide any details. – John Bollinger Jul 04 '15 at 04:19
  • A note of nitpickery: Non-blocking synchronous functions don't block, and aren't asynchronous, but are rationalised to solve the same kind of problems as asynchronous functions. – autistic Jul 04 '15 at 04:27

2 Answers

2

With Linux you can always check the source code.

See fadvise.c:

case POSIX_FADV_WILLNEED:
    ...
    force_page_cache_readahead(mapping, f.file, start_index,
                   nrpages);
    break;

So posix_fadvise calls force_page_cache_readahead to perform readahead.

Now let's look at madvise.c:

static long madvise_willneed(...)
{
    ...
    force_page_cache_readahead(file->f_mapping, file, start, end-start);
    return 0;
}

So MADV_WILLNEED and POSIX_FADV_WILLNEED are equivalent on Linux.
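From user space, the file-descriptor side of this looks roughly like the sketch below. `hint_file_willneed` is a made-up helper name for illustration, and I'm assuming glibc on Linux (hence `_DEFAULT_SOURCE`). Note that `posix_fadvise` returns an error number directly rather than setting `errno`, and that success only means the hint was accepted, not that any data has arrived.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes posix_fadvise() */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: hint that the whole file will be needed soon.
 * Returns 0 if the hint was accepted, otherwise an errno-style value. */
static int hint_file_willneed(const char *path)
{
    int fd = open(path, O_RDONLY);
    int rc;

    if (fd < 0)
        return errno;

    /* offset 0, len 0 means "from the start to the end of the file". */
    rc = posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);

    close(fd);
    return rc;
}
```

Pages brought in this way land in the page cache, so a later `read()` or `mmap()` of the same file can benefit from them if they haven't been evicted in the meantime.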

Can this be called asynchronous I/O? I don't think so. Async I/O usually implies there is some notification that tells you when the data can be retrieved. Advice is just advice: not only do you not know when the data is ready to be read, but if you are too late, the data might already have been thrown away.
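To make the "no notification" point concrete, here is a minimal sketch, assuming Linux with glibc. The helper names `map_with_willneed` and `resident_pages` are made up for illustration. About the only thing you can do after the hint is poll with `mincore()`, and even that is just a snapshot: a page reported as resident can be evicted again before you touch it.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes madvise() and mincore() */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a file read-only and issue MADV_WILLNEED on it.
 * Returns the mapping (or MAP_FAILED) and stores its length in *len_out. */
static void *map_with_willneed(const char *path, size_t *len_out)
{
    struct stat st;
    void *p;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return MAP_FAILED;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return MAP_FAILED;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return MAP_FAILED;

    /* Only a hint: the return value says nothing about when (or whether)
     * the pages will actually be in memory. */
    (void)madvise(p, (size_t)st.st_size, MADV_WILLNEED);

    *len_out = (size_t)st.st_size;
    return p;
}

/* Hypothetical helper: count pages of [addr, addr + len) that are resident
 * right now. addr must be page-aligned. Returns -1 on error. */
static long resident_pages(void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t npages = (len + page - 1) / page;
    unsigned char *vec = malloc(npages);
    long count = 0;

    if (vec == NULL)
        return -1;
    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }
    for (size_t i = 0; i < npages; i++)
        count += vec[i] & 1;        /* low bit set means the page is resident */
    free(vec);
    return count;
}
```

Calling `resident_pages()` right after `map_with_willneed()` and again a little later may show the count going up as read-ahead progresses, but nothing guarantees that it will, and nothing tells you when it is done.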

0

In general, you should assume that the m* functions won't perform async readahead.

Mark Harrison