How to lazy allocate zeroed memory?

Question

From what I understand, I have to choose between calloc, which will allocate zeroed memory, and malloc, which can allocate memory on demand.

Is there a function that combines both those properties? Maybe direct call to mmap?

If it's possible, why doesn't calloc do it?

Good `calloc` implementations already avoid dirtying pages they got from the OS that are known to be already-zero. See http://stackoverflow.com/questions/2688466/why-mallocmemset-is-slower-than-calloc. You only need to do it yourself if you want that behaviour with extra alignment, because there seems to be no `calloc`-like function that takes an alignment parameter. — Peter Cordes, Apr 24 '17 at 02:41

score 7 · Accepted Answer · answered Nov 11 '11 at 10:33

There are a few mechanisms to get pre-zeroed memory from the operating system:

mmap(2)'s MAP_ANONYMOUS flag forces the contents to be initialized to zero.
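A minimal sketch of that approach, assuming Linux with glibc. The helper names `lazy_zalloc` and `lazy_zfree` are made up for illustration. The kernel hands back address space that reads as zero, and physical pages are normally only assigned when a page is first touched.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: returns len bytes of zero-filled memory,
 * or NULL on failure. */
void *lazy_zalloc(size_t len)
{
    assert(len > 0);  /* mmap rejects a zero length */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Release with munmap, not free(): the block did not come from malloc. */
void lazy_zfree(void *p, size_t len)
{
    if (p != NULL)
        munmap(p, len);
}
```

Note that a pointer obtained this way cannot be passed to `free` or `realloc`, and the length has to be remembered for the later `munmap`.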

POSIX shared memory segments are also zero-initialized. The steps are:

shm_open(3) provides you with a file descriptor
ftruncate(2) the "file" to the size you want
mmap(2) the "file" into your address space

The memory comes pre-zeroed:

   This volume of IEEE Std 1003.1-2001 specifies that memory
   objects have initial contents of zero when created. This is
   consistent with current behavior for both files and newly
   allocated memory. For those implementations that use physical
   memory, it would be possible that such implementations could
   simply use available memory and give it to the process
   uninitialized. This, however, is not consistent with standard
   behavior for the uninitialized data area, the stack, and of
   course, files. Finally, it is highly desirable to set the
   allocated memory to zero for security reasons. Thus,
   initializing memory objects to zero is required.
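The shm_open / ftruncate / mmap steps above could be sketched roughly as follows. The helper name `shm_zalloc` and the object name passed to it are assumptions for illustration, and on glibc older than 2.34 the program may need to be linked with `-lrt`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>      /* O_* constants */
#include <stddef.h>
#include <sys/mman.h>   /* shm_open, shm_unlink, mmap */
#include <sys/stat.h>   /* mode constants */
#include <unistd.h>     /* ftruncate, close */

/* Hypothetical helper: maps len zero-filled bytes backed by a POSIX
 * shared memory object called name (e.g. "/demo").
 * Returns NULL on failure. */
void *shm_zalloc(const char *name, size_t len)
{
    assert(name != NULL && len > 0);
    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd == -1)
        return NULL;
    shm_unlink(name);  /* drop the name; the object lives on via fd */

    void *p = MAP_FAILED;
    if (ftruncate(fd, (off_t)len) == 0)
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);         /* an established mapping keeps the object alive */
    return p == MAP_FAILED ? NULL : p;
}
```

The mapping is released with `munmap(p, len)`. Unlinking the name immediately is a design choice for a private scratch buffer; a segment meant to be shared with other processes would keep the name until every process has opened it.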

It appears that this memory is zeroed at use: mm/shmem.c function shmem_zero_setup():

/**
 * shmem_zero_setup - setup a shared anonymous mapping
 * @vma: the vma to be mmapped is prepared by do_mmap_pgoff
 */     
int shmem_zero_setup(struct vm_area_struct *vma)
{   
    struct file *file;
    loff_t size = vma->vm_end - vma->vm_start;

    file = shmem_file_setup("dev/zero", size, vma->vm_flags);
    if (IS_ERR(file))
        return PTR_ERR(file);

    if (vma->vm_file)
        fput(vma->vm_file);
    vma->vm_file = file;
    vma->vm_ops = &shmem_vm_ops;
    return 0;
}

score 4 · Answer 2 · edited May 23 '17 at 12:25

If you're trying to emulate calloc with malloc (i.e. use malloc but receive zeroed memory), then you can do so with memset:

foo = (char*)malloc(BLOCK_SIZE);
memset(foo,'\0',BLOCK_SIZE);

However, this is a bad idea (it's slower than calloc; see "Why malloc+memset is slower than calloc?") and does not result in the 'lazy allocation' you refer to, for the reasons stated in Fritschy's answer.
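For comparison, a sketch of the same allocation done with calloc directly. The helper name `alloc_zeroed_block` is hypothetical, and its `block_size` parameter stands in for BLOCK_SIZE above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper standing in for the malloc + memset pair. */
char *alloc_zeroed_block(size_t block_size)
{
    assert(block_size > 0);
    /* calloc returns zero-filled memory; a good implementation can skip
     * the explicit clear when it knows the pages came fresh from the OS. */
    return calloc(block_size, sizeof(char));
}
```

The result is released with `free`, exactly like a malloc'd block.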

answered Nov 11 '11 at 10:32

Andrew Stubbs


thanks for the link to the malloc+memset vs. calloc performance – MofX Feb 04 '15 at 15:21
Glibc's `calloc` knows that `mmap` gives it already-zeroed memory, so the other answers claiming that `calloc` always writes zeros itself (dirtying the pages) are wrong. But fortunately, gcc even optimizes malloc+memset to `calloc` when it can. – Peter Cordes Apr 24 '17 at 02:27

score 0 · Answer 3 · answered Oct 23 '15 at 15:11

On Linux (and possibly other operating systems, though I'm not sure), malloc allocates virtual address space, not physical memory. When you first read from or write to that virtual address space, the kernel finds available memory pages and maps them into it (they might be in swap or in RAM). A tool like htop displays a few memory statistics; the important ones here are resident memory and virtual memory. Resident memory is how much physical memory your process actually occupies, and virtual memory is the sum of all the memory it has requested.

So when you call calloc, it allocates the memory just like malloc and then writes zeroes to the entire allocation. These writes cause the kernel to allocate physical pages and map them into the virtual address space.

So with this in mind, malloc itself isn't lazy; rather, the kernel is being lazy and does not actually assign memory pages to the process until they are needed.
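One way to watch the kernel's laziness is to ask how many pages of a region are resident before and after touching it. The sketch below assumes Linux and uses `mincore(2)`; the helper name `resident_pages` is made up for illustration. On a typical Linux system a fresh anonymous mapping would be expected to report no resident pages until something writes to it, though sandboxes and transparent huge pages can change the numbers.

```c
#define _DEFAULT_SOURCE   /* exposes mincore and MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: counts how many pages of [addr, addr + len) are
 * currently resident. addr must be page-aligned. Returns -1 on error. */
long resident_pages(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0 && len > 0);
    size_t npages = (len + (size_t)page - 1) / (size_t)page;
    unsigned char *vec = malloc(npages);
    if (vec == NULL)
        return -1;
    long count = -1;
    if (mincore(addr, len, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < npages; i++)
            count += vec[i] & 1;   /* low bit set means resident */
    }
    free(vec);
    return count;
}
```

Calling it on a region obtained from mmap, writing one byte, and calling it again shows the resident count growing only as pages are touched.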

In general, if you need the memory zeroed, you will have to either use something like calloc to zero it all up front or keep track yourself of which chunks of memory have been zeroed.

score 0 · Answer 4 · answered Nov 11 '11 at 10:32

calloc is identical to malloc except that, as you say, it zeroes the allocated memory, and it accepts two parameters: (NumberOfElements, ElementSize). So

malloc(size);

is equivalent to

calloc(1, size);

except that the latter provides zeroed memory.

What is this allocating memory "on demand" that you talk about?

Basile Starynkevitch · Answer 5 · 2017-04-24T04:35:04.930

calloc is conceptually the same as malloc followed by a zero-ing of the memory zone.

I don't understand why you believe that malloc is doing lazy allocation. It does not.

Ok, some Linux systems had memory overcommit, but you should not depend upon it (it is more a design bug than a feature, IMHO). Also, calloc of large memory zones is generally implemented by some call to mmap which does not require an additional clearing of that zone.

If you need something lazy, you might use mmap with MAP_NORESERVE but be careful with that.
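A minimal sketch of that suggestion, assuming Linux (MAP_NORESERVE is not in POSIX). The helper name `reserve_zeroed` is hypothetical. The caution is warranted: according to the mmap(2) man page, when swap space is not reserved a write can fail with SIGSEGV if no physical memory is available.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS and MAP_NORESERVE */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: reserves len bytes of zero-filled address space
 * without reserving swap for it. Returns NULL on failure. */
void *reserve_zeroed(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

The region is released with `munmap(p, len)`. Whether the flag has any effect depends on the system's overcommit settings.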

answered Nov 11 '11 at 10:44


glibc's `calloc` knows that memory from `mmap` is already zeroed. Large allocations with `calloc` don't make your system start to swap, because it doesn't dirty the pages. gcc even optimizes `malloc` + `memset` to `calloc`. (related: why is there no `aligned_calloc` other than raw mmap, which produces pointers that can't be passed to `free`) – Peter Cordes Apr 24 '17 at 02:25
