
I'm trying to rewrite malloc and calloc. My question is about how calloc is implemented, not how to use it.

One should always use calloc() instead of malloc()+memset(), because calloc() can take advantage of copy-on-write (COW).

Some calloc implementations look like this:

void * calloc(size_t nelem, size_t elsize)
{
    void *p;

    p = malloc (nelem * elsize);
    if (p == 0)
        return (p);

    bzero (p, nelem * elsize);
    return (p);
}

But such implementations write to every page with bzero(), so they get no benefit from COW at all (and they don't check nelem * elsize for overflow).
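The missing overflow check can be sketched as follows. This is a minimal illustration, not any libc's actual code: the name `my_calloc` is hypothetical, and the division-based guard is just one portable alternative to compiler builtins such as `__builtin_mul_overflow()`. It still clears the block with `memset()`, so it does nothing for copy-on-write.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name: my_calloc is illustrative, not taken from a real libc. */
void *my_calloc(size_t nelem, size_t elsize)
{
    /* Reject element counts whose product would wrap around SIZE_MAX. */
    if (elsize != 0 && nelem > SIZE_MAX / elsize)
        return NULL;

    size_t total = nelem * elsize;
    /* Sanity check of the guard above: the product did not wrap. */
    assert(elsize == 0 || total / elsize == nelem);

    void *p = malloc(total);
    if (p != NULL)
        memset(p, 0, total);   /* writes every byte, so no COW benefit */
    return p;
}
```

With this guard, a request such as `my_calloc(SIZE_MAX, 2)` fails with NULL instead of silently allocating a block that is too small.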

If those implementations don't call bzero(), they must assume that the mmap'ed pages they receive are zero-filled. That is probably true for security reasons, since the kernel must not leak data from other processes, but I can't find any standard that guarantees it.

Instead of using MAP_ANON, we could mmap from /dev/zero:

fd = open("/dev/zero", O_RDWR); 
a = mmap (0, 4096e4, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_FILE, fd, 0);

But /dev/zero is not mandated by POSIX, and one could easily do sudo mv /dev/zero /dev/foo, breaking my implementation.

What's the correct way to efficiently rewrite calloc() while still benefiting from copy-on-write?

Bilow
  • The correct way to implement `calloc()` and `malloc()` both probably involves making an appropriate system call. Details are, perforce, system-specific. – John Bollinger Oct 20 '17 at 17:06
  • @JohnBollinger Which syscall for example? – Bilow Oct 20 '17 at 17:07
  • `mmap`, for example, if the allocation is large enough and `mmap` has an option which guarantees zero fill (such as `MAP_ANONYMOUS` on linux). Although I'm not convinced that this is a common optimization. – rici Oct 20 '17 at 17:25
  • In the [answer you linked](https://stackoverflow.com/a/2688522/5405361), there's a note that says with `mmap`, the kernel always scrubs the memory since you don't leave memory from other procs in it (however, this is likely only guaranteed on linux, I think). So in that case, memory should already be zero'd. You'll only need to zero the memory out if it comes from one of your pools. – Cassandra Fox Oct 20 '17 at 18:21
  • please note that your `nelem * elsize` is dangerous and a common source of security holes. You must check for overflows there (keyword: `__builtin_mul_overflow()`) – ensc Oct 21 '17 at 11:10
  • [Here's](https://stackoverflow.com/questions/17542601/anonymous-mmap-zero-filled) a similar question – Bilow Oct 22 '17 at 14:53

1 Answer


Pure POSIX does not support anonymous memory mappings, and there are no lower-level interfaces than calloc to allocate zeroed memory.

Existing POSIX implementations support anonymous private memory mappings as an extension, via the MAP_ANON or MAP_ANONYMOUS flag (or historically, by mapping from /dev/zero). The kernel makes sure that the application only sees zeroed memory. (There are older interfaces as well, such as brk and sbrk, but they are difficult to use and not thread-safe.)
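As a rough sketch of the two approaches just mentioned, the helper below requests fresh pages with `MAP_ANONYMOUS` when the platform defines it and falls back to mapping `/dev/zero` otherwise. The name `fresh_pages` is hypothetical, and whether the returned memory is guaranteed to be zeroed is something to confirm in the platform's own documentation.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* glibc: expose MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#if !defined(MAP_ANONYMOUS) && defined(MAP_ANON)
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical helper: returns len bytes of fresh memory, or NULL on failure. */
void *fresh_pages(size_t len)
{
    void *p;

    assert(len > 0);   /* mmap rejects zero-length mappings */
#ifdef MAP_ANONYMOUS
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
#else
    /* Historical fallback: a private mapping of /dev/zero. */
    int fd = open("/dev/zero", O_RDWR);
    if (fd < 0)
        return NULL;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);         /* the mapping stays valid after the descriptor is closed */
#endif
    return p == MAP_FAILED ? NULL : p;
}
```

The memory should be released with `munmap(p, len)` rather than `free()`.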

An implementation of the malloc family of functions usually allocates larger blocks using mmap and keeps a watermark pointer for each block, which indicates how much of the block has already been handed to the application at least once (via malloc, realloc, or calloc; it does not matter which). calloc checks the watermark pointer before returning an allocation, and if the memory was used by the application before, it clears it. Otherwise, the memory is returned directly because it is known to be fresh and thus already cleared by the kernel.
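The watermark idea can be illustrated with a toy bump allocator. Everything here is an assumption made for the sake of the example (the `struct arena` layout, the function names, the 16-byte alignment, and the all-at-once `arena_reset` standing in for `free()`); real allocators are considerably more involved. The point is only that `arena_calloc` clears the part of an allocation that lies below the watermark and leaves the rest untouched.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* glibc: expose MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#if !defined(MAP_ANONYMOUS) && defined(MAP_ANON)
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical toy arena; not modeled on any particular allocator. */
struct arena {
    unsigned char *base;   /* start of the mmap'ed block */
    size_t size;           /* total bytes in the block */
    size_t next;           /* offset of the next free byte (bump pointer) */
    size_t watermark;      /* bytes ever handed out; memory beyond it is fresh */
};

int arena_init(struct arena *a, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    a->base = p;
    a->size = size;
    a->next = 0;
    a->watermark = 0;
    return 0;
}

/* Frees everything at once; the watermark remembers what may be dirty. */
void arena_reset(struct arena *a)
{
    a->next = 0;
}

void *arena_calloc(struct arena *a, size_t nelem, size_t elsize)
{
    assert(a->next <= a->size && a->watermark <= a->size);

    /* Overflow check, leaving room for rounding up to 16 bytes. */
    if (elsize != 0 && nelem > (SIZE_MAX - 15) / elsize)
        return NULL;
    size_t total = (nelem * elsize + 15) & ~(size_t)15;
    if (total > a->size - a->next)
        return NULL;

    unsigned char *p = a->base + a->next;
    size_t end = a->next + total;

    /* Only the part below the watermark can hold stale data. */
    if (a->next < a->watermark) {
        size_t dirty_end = end < a->watermark ? end : a->watermark;
        memset(p, 0, dirty_end - a->next);
    }

    a->next = end;
    if (end > a->watermark)
        a->watermark = end;
    return p;
}
```

In this sketch, a first `arena_calloc` on a new arena performs no `memset` at all, while a call made after `arena_reset` clears only the previously used prefix.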

Large blocks can be allocated directly using mmap, too. But the kernel still has to clear each page eventually (before using it to back a mapping that triggered a copy-on-write fault), so this is only a clear win if the allocation is much larger than actually needed and most of it is never written to.

Florian Weimer
  • *Otherwise, it is returned directly because it's known that it is fresh and thus has been cleared by the kernel.* This assumption can fail if the program has a buffer overflow error or a stray pointer that writes to this yet unused space. The undefined behavior may occur later in a completely unrelated part of the program because a block returned by `calloc()` will not be completely set to all bits zero. The programmer is likely to spend a **long** time debugging this and may erroneously conclude that `calloc()` cannot be trusted until the real bug is found, if at all. – chqrlie Oct 21 '17 at 11:23
  • If there are buffer overflows, all bets are off anyway. Furthermore, heap-based buffer overflows are in fact easily detected using valgrind or Address Sanitizer (and there are proprietary tools, too). Stack-based overflows are much harder to detect (if they do not happen to corrupt the return address or stack canary). – Florian Weimer Oct 21 '17 at 11:25
  • You are absolutely right, I was just pointing to a real-life example which would have been solved quickly with valgrind, had the programmer been developing on a decent system. Instead, this bug caused a company-wide distrust of the C library and `calloc()` was disallowed. – chqrlie Oct 21 '17 at 11:35
  • Can you confirm that, if `MAP_ANON` exists, I can be sure the kernel will give me zeroed memory? – Bilow Oct 21 '17 at 12:15
  • You need to read the documentation. I wouldn't be surprised if there is some obscure platform out there which has `MAP_ANON` and returns non-cleared memory. On Linux, if you want non-cleared memory with a kernel that supports it, you have to specify `MAP_UNINITIALIZED` as well. – Florian Weimer Oct 21 '17 at 16:18
  • Thanks. If I get it right, all implementations of `calloc()` that take advantage of copy-on-write rely on getting fresh memory from the kernel; otherwise they call `bzero()`. Memory is also cleared when part of an `mmap`'ed block was already used. – Bilow Oct 22 '17 at 14:51