2

In Nicholas Ormrod's talk at CppCon 2016, he mentioned an insidious bug at Facebook where a single byte had been read from an uninitialized (unwritten-to) page, twice, such that there were cases where the second read returned a (nonzero) value different from the first read's value (zero).

He mentioned they used jemalloc, and I also presume they were running on Linux. jemalloc's manpage says that it always prefers mmap() over sbrk().

Now, jemalloc's only mmap() call uses the flags MAP_PRIVATE | MAP_ANONYMOUS with the occasional inclusion of MAP_FIXED, and in particular it doesn't use MAP_UNINITIALIZED. This means that pages are always zero-initialized when allocated.

Additionally, even madvise() with MADV_DONTNEED will, for anonymous mappings, return "zero-fill-on-demand pages" for anonymous mappings, which I read as "zero-initialized pages."

My question is: How is it possible that the second read would ever return a nonzero value, causing their bug?

Yam Marcovic
  • 7,953
  • 1
  • 28
  • 38
  • To paraphrase the video: the bug Nicholas describes happens when a page is conditionally returned to the kernel, so when you read the byte from the page, the kernel says it's uninitialized memory, so it can return `0`. However, in the second read, you *write* instead of read, so the kernel realizes that you actually need the memory and fetches the actual data stored there – Justin Sep 08 '17 at 19:07
  • This question does not provide enough context to be able to answer the question. It is lacking a [mcve]. In order to answer this question, answerers would have to watch the video – Justin Sep 08 '17 at 19:08
  • @Justin I linked to the video for extra info, but I mentioned all the details of the bug here. – Yam Marcovic Sep 08 '17 at 19:11
  • 1
    @Justin They don't write there. What he said was that the first read causes the bypassing of the write, and then the string returns to the caller which will ultimately re-read it with something other than a null-terminator. – Yam Marcovic Sep 08 '17 at 19:12
  • Undefined behavior is undefined. Kernel has nothing to do with it. – SergeyA Sep 08 '17 at 19:23
  • 1
    See [Is uninitialized local variable the fastest random number generator?](https://stackoverflow.com/q/31739792/1708801) in [my answer to that question I reference a C defect report which talks about "wobbly values"](https://stackoverflow.com/a/31746063/1708801) – Shafik Yaghmour Sep 08 '17 at 19:32

1 Answers1

1

Whatever explanation this guys is providing is completely broken (at least, in the given context). And the code has undefined behavior no matter what.

If data points to the chunk which was allocated with at least size() + 1 size, the code has undefined behavior because of the race condition (he mentions usage of threads before).

If the size of the data is less than that (for example, equal to size()) the code has undefined behavior because of out of bounds access (and race condition becomes a moot point).

SergeyA
  • 61,605
  • 5
  • 78
  • 137
  • It sounds like they're making all kinds of shortcuts at FB, but since guys like Andrei Alexandrescu (CPP Committee) are working on that code, I give them the benefit of the doubt that they know what they're doing despite triggering UB (and he touches on that earlier in the talk as well). So I'm trying to make sense of his explanation and not throw it out the window. – Yam Marcovic Sep 08 '17 at 19:42
  • Also, reading from any uninitialized storage is already undefined behavior in itself. – Yam Marcovic Sep 08 '17 at 19:44
  • @YamMarcovic that is correct, but I do not know if storage is uninitialized since I do not know how the storage was initialized in the first place. I only watched the video from the point in your link. However, sane assumption would be that it was not, so this is the third undef behavior point. – SergeyA Sep 08 '17 at 19:50