4

We know that the heap is an area of demand-zero memory that begins immediately after the uninitialized data area and grows upward (toward higher addresses). Demand-zero means that the first time the CPU touches a virtual page in the heap area, the corresponding physical page is filled with zeros.

If that is the case, why is there a function, calloc, that initializes the allocated memory to zero? Why do demand-zero pages need to be zeroed again if they will already be zero when first accessed?

Jonathan Leffler
  • 1
    `calloc()` was designed to be used in all (hosted) implementations of C, not just Linux. – pmg Sep 22 '20 at 06:58
  • 1
    "[T]he heap is an area of ... memory that begins immediately after the uninitialized data area and grows upward" That is seldom true on modern systems. Two consecutive calls to `malloc` (or `calloc`) doesn't have to return two contiguous chunks of memory. – Some programmer dude Sep 22 '20 at 06:58
  • 1
    `We know that` Even *if* you knew that about your version of the compiler on Linux today, it's not guaranteed by the language. That means it may not apply to other compilers or platforms, or even to your own next build tomorrow. That's a big chance to take. – dxiv Sep 22 '20 at 06:59
  • See https://stackoverflow.com/questions/32810779/why-memory-isnt-zero-out-from-malloc – fpmurphy Sep 22 '20 at 07:07
  • 2
    We don't know that. The concept of a heap has nothing to do with Linux, or any particular implementation of Linux. – EML Sep 22 '20 at 07:34

5 Answers

9

Because after you've used the space and released it with free(), it might be allocated again. If you don't use calloc(), there's no guarantee that the memory will be zeroed on the second time it is used. (Calling free() does not zero the space.)
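The reuse scenario can be checked with a small sketch. The function name `calloc_is_zero_after_reuse` is made up for this illustration, and whether the allocator really hands back the same block is implementation-specific (an optimizing compiler may even drop the fill before `free()`); the only thing the C standard promises is that the `calloc()` result is zero either way.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: dirty a heap block, free it, then request the same
 * size from calloc(). Returns 1 if every byte of the calloc'd block is
 * zero, 0 if not, and -1 if an allocation failed. */
int calloc_is_zero_after_reuse(size_t n)
{
    assert(n > 0);

    unsigned char *first = malloc(n);
    if (first == NULL)
        return -1;
    memset(first, 0xAB, n);   /* write non-zero data into the block */
    free(first);              /* free() is not required to clear it */

    /* The allocator may or may not reuse the block freed above. */
    unsigned char *second = calloc(n, 1);
    if (second == NULL)
        return -1;

    int all_zero = 1;
    for (size_t i = 0; i < n; i++) {
        if (second[i] != 0) {
            all_zero = 0;
            break;
        }
    }
    free(second);
    return all_zero;
}
```

Calling `calloc_is_zero_after_reuse(64)` should return 1 on any conforming implementation. Replacing the `calloc()` call with `malloc()` would remove that guarantee, because the contents of a `malloc()`'d block are indeterminate.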

Jonathan Leffler
  • If it is allocated again, why do we need to zero the memory again? We are going to write to these memory blocks, so their old content will be overwritten anyway. There could be a problem if you read the blocks before writing them, but it is the programmer's responsibility to make sure the application does not read unwritten blocks. –  Sep 22 '20 at 07:45
  • 3
    When you used the memory, at least some parts of it were changed to non-zero values. If you are going to overwrite the memory with new data, you don’t need to use `calloc()`; using `malloc()` suffices. If you need the data zeroed, then `calloc()` can zero it as efficiently as you can (or, more likely, it can do it more efficiently than you can). – Jonathan Leffler Sep 22 '20 at 07:50
  • 2
    @amjad: Using `calloc` is fulfilling the programmer’s responsibility to initialize the memory. – Eric Postpischil Sep 22 '20 at 08:46
7

calloc does not necessarily have to initialize the memory to zero by itself. The description of calloc says that:

The space is initialized to all bits zero.

but it does not say that it is calloc that does this, just that the memory is initialized to zero by some mechanism. This is unlike malloc:

The malloc function allocates space for an object whose size is specified by size and whose value is indeterminate.

calloc guarantees that the memory is zeroed; malloc does not. If the block consists of fresh copy-on-write zero pages, calloc may know that it does not need to zero them again, which makes it faster than malloc + memset: memset cannot know that the memory is already zero (unless the compiler optimizes malloc + memset(..., 0, ...) into calloc). On the other hand, if the block is being reused, calloc has to zero it even if the caller does not care about the contents. In that case calloc effectively does malloc + memset, so plain malloc is faster when no zeroing is needed.
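To make the "calloc may know not to zero it again" idea concrete, here is a toy sketch, not how any real libc is implemented. It assumes a fixed static arena (zero-initialized by the language, standing in for fresh zero pages) and a high-water mark that records which bytes have ever been handed out. All names (`toy_malloc`, `toy_calloc`, `toy_release_all`, `toy_bytes_cleared`) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_SIZE 4096

/* Static storage starts out all zero, standing in for fresh zero pages. */
static _Alignas(max_align_t) unsigned char arena[ARENA_SIZE];
static size_t top;           /* offset of the next free byte */
static size_t high_water;    /* bytes below this offset may hold old data */
static size_t bytes_cleared; /* how many bytes toy_calloc had to memset */

/* Bump allocation; sizes are rounded up so returned pointers stay aligned. */
void *toy_malloc(size_t size)
{
    const size_t align = _Alignof(max_align_t);
    if (size > ARENA_SIZE - top)
        return NULL;
    size_t rounded = (size + align - 1) / align * align;
    void *p = arena + top;
    top += rounded;
    if (top > high_water)
        high_water = top;
    return p;
}

/* Release everything at once WITHOUT clearing it, much like free(). */
void toy_release_all(void)
{
    top = 0;
}

/* Zero only the part of the block that was handed out before. */
void *toy_calloc(size_t nmemb, size_t size)
{
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;                    /* nmemb * size would overflow */
    size_t total = nmemb * size;
    size_t start = top;
    size_t dirty_end = high_water;      /* mark before this allocation */

    unsigned char *p = toy_malloc(total);
    if (p == NULL)
        return NULL;
    if (start < dirty_end) {
        size_t dirty = dirty_end - start;
        if (dirty > total)
            dirty = total;
        memset(p, 0, dirty);            /* reused bytes must be cleared */
        bytes_cleared += dirty;
    }
    return p;                           /* never-used bytes are still zero */
}

size_t toy_bytes_cleared(void)
{
    return bytes_cleared;
}
```

With this sketch, a first `toy_calloc(64, 1)` clears nothing because the memory has never been used. After the caller scribbles on it and `toy_release_all()` is called, a following `toy_calloc(128, 1)` clears only the 64 reused bytes. A caller doing `toy_malloc` + `memset` would always clear all of them.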

4

In short, it's more portable to set the memory to zero explicitly if that's what your application needs, and faster to leave it alone if it doesn't. You don't have to use calloc() to get zeroed memory: you could just use malloc() and zero it yourself (e.g., using memset). But calloc() will give you zeroed memory more quickly if it can take advantage of a platform-specific feature to get it.
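As a rough sketch of the "zero it yourself" route, the helper below (the name `zeroed_alloc` is invented here) does by hand what calloc() promises: it checks that the element count times the element size does not overflow, allocates, and clears the block with memset.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical do-it-yourself equivalent of calloc(nmemb, size). */
void *zeroed_alloc(size_t nmemb, size_t size)
{
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;              /* nmemb * size would overflow */
    size_t total = nmemb * size;
    void *p = malloc(total);
    if (p != NULL)
        memset(p, 0, total);      /* always touches every byte */
    return p;
}
```

For example, `int *v = zeroed_alloc(8, sizeof *v);` and `int *w = calloc(8, sizeof *w);` both yield eight ints equal to zero. The difference is that the hand-written version always writes the whole block, whereas calloc() is free to skip that work when it knows the memory is already zero.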

Kevin Boone
  • But why do we need to zero the memory? If the memory was freed and then allocated again, we don't need to zero it, because we are going to write something into the memory blocks and their original content will be overwritten anyway. We just need to make sure our applications won't read those reallocated blocks before writing new content to them. –  Sep 22 '20 at 07:49
  • 1
    @amjad -- it's up to you, as developer, to decide whether you need to zero memory as a specific step. If you're going to write all the values in that memory in your code then, of course, there's no reason to zero it. If you're working on large blocks of data, only a few elements of which will be non-zero, it's often quicker to let the platform zero it. The platform/compiler will know the quickest way to zero large memory blocks -- a naive looping through the bytes and setting each one to zero is rarely efficient. – Kevin Boone Sep 22 '20 at 07:57
0

The 'heap' memory will contain whatever trash was in memory when your program was loaded (just like the stack).

It is up to your application to zero memory. There are a few ways to do that.

  1. calloc()
  2. a loop in your code
  3. memset()
user3629249
0

This is partly historical and partly a useful feature.

Originally the heap was handed out as uninitialised memory (which was not zero), because that was quicker. This was later discovered to be a security problem, so nowadays the memory you get is cleared to zero (and never contains data set by other processes). Do not assume that all systems do this, especially small embedded CPUs, where CPU and bus accesses are expensive in time and power.

calloc is very handy when you allocate arrays (as you can see, its signature is designed for arrays). With arrays you often want the values initialized to zero. A hand-written loop is very slow, and static arrays were already zero-initialized. So we have two possibilities: initialized memory with calloc or uninitialised memory with malloc.

Note: malloc does not guarantee all zeros: it may give you memory that your own process previously allocated and freed. Only new memory handed over by the kernel is zeroed (e.g. via brk/sbrk, which malloc/calloc sometimes call when the existing heap has run out of free memory). These are two different kinds of memory allocation.
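A typical case where zero-initialized arrays are wanted is a table of counters. The sketch below uses an invented function, `count_bytes`, that allocates 256 counters with calloc() so that every count starts at zero without a separate initialization loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical example: count how often each byte value occurs in
 * data[0..len). Returns a calloc()'d array of 256 counters that the
 * caller must free(), or NULL if the allocation failed. */
size_t *count_bytes(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);

    /* calloc(count, element size): all 256 counters start at zero. */
    size_t *counts = calloc(256, sizeof *counts);
    if (counts == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++)
        counts[data[i]]++;
    return counts;
}
```

With malloc() in place of calloc(), the increments would operate on indeterminate values unless the array were cleared first.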

Giacomo Catenazzi