In Nicholas Ormrod's talk at CppCon 2016, he mentioned an insidious bug at Facebook in which a single byte was read twice from an uninitialized (never-written-to) page, and in some cases the second read returned a nonzero value even though the first read had returned zero.
He mentioned they used jemalloc, and I also presume they were running on Linux. jemalloc's manpage says that it always prefers mmap() over sbrk().
Now, jemalloc's only mmap() call uses the flags MAP_PRIVATE | MAP_ANONYMOUS with the occasional inclusion of MAP_FIXED, and in particular it doesn't use MAP_UNINITIALIZED. This means that pages are always zero-initialized when allocated.
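To make the assumption concrete, here is a minimal sketch of what I mean by "zero-initialized when allocated." It uses Python's mmap module (which wraps the same mmap() system call) rather than jemalloc itself, and the function name fresh_anonymous_page_is_zero is purely my own illustration, not anything from jemalloc or the talk.

```python
import mmap


def fresh_anonymous_page_is_zero(length: int = mmap.PAGESIZE) -> bool:
    """Map a private anonymous region and report whether every byte reads as zero."""
    # fileno=-1 asks for an anonymous mapping; the flags mirror the ones
    # named above (MAP_PRIVATE | MAP_ANONYMOUS, no MAP_UNINITIALIZED).
    with mmap.mmap(
        -1,
        length,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    ) as region:
        # Compare the whole never-written region against a zero-filled buffer.
        return region[:] == bytes(length)


if __name__ == "__main__":
    print(fresh_anonymous_page_is_zero())
```

On a Linux system I would expect this to report that the region is all zeros, which is the behavior I am assuming for jemalloc's mappings.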
Additionally, even madvise() with MADV_DONTNEED will, for anonymous mappings, return "zero-fill-on-demand pages," which I read as "zero-initialized pages."
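Here is a small sketch of how I understand the MADV_DONTNEED behavior. It assumes Linux and Python 3.8 or later (where mmap objects gained a madvise() method), and the helper name read_after_dontneed is hypothetical. It writes a marker byte, advises the kernel that the page is no longer needed, and reads the byte again.

```python
import mmap


def read_after_dontneed(marker: int = 0xAB):
    """Return (value before madvise, value after madvise) for the first byte."""
    with mmap.mmap(
        -1,
        mmap.PAGESIZE,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    ) as region:
        region[0] = marker                  # dirty the page
        before = region[0]
        # Tell the kernel the whole mapping's contents are no longer needed.
        region.madvise(mmap.MADV_DONTNEED)
        after = region[0]                   # page is faulted back in on access
        return before, after


if __name__ == "__main__":
    print(read_after_dontneed())
```

If my reading of the manpage is right, the second value should come back as zero on Linux, since the private anonymous page is replaced by a zero-fill-on-demand page.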
My question is: How is it possible that the second read would ever return a nonzero value, causing their bug?