10

I'm currently using gdb to see the effects of low level code. Right now I'm doing the following:

int* pointer = (int*)calloc(1, sizeof(int));

yet when I examine the memory using info proc mappings in gdb, I see the following after what I presume is the .text section (since Objfile shows the name of the binary I'm debugging):

...
Start Addr    End Addr    Size     Offset    Objfile
0x602000      0x623000    0x21000  0x0       [heap]

How come the heap is that big when all I did was allocate space for a single int?

The weirdest thing is, even when I'm doing calloc(1000, sizeof(int)) the size of the heap remains the same.

PS: I'm running Ubuntu 14.04 on an x86_64 machine. I'm compiling the source using g++ (yes, I know I shouldn't use calloc in C++, this is just a test).

trincot
Martin
  • 9
    Why should the system speculatively allocate a tiny amount of memory when you may very well want to allocate more space later on anyway? – Kerrek SB May 29 '14 at 16:11
  • 5
    Not only MAY want more space but it is VERY likely you will need more space. – RedX May 29 '14 at 16:12
  • But why does it always allocate that exact size instead of a different value? Also, `print pointer` shows it's pointing to 0x602010, while the heap begins at 0x602000. Why is that? – Martin May 29 '14 at 16:13
  • 1
    @user3688293: A heap is a data structure like an array or a linked list, and like a linked list it has next element pointers, etc. Those have to be stored along with your data. Using small allocations like a single `int` is actually very wasteful. – Ben Voigt May 29 '14 at 16:15
  • @Redx it is not likely; on the contrary, the probability is 0. – Peter - Reinstate Monica May 29 '14 at 16:15
  • 1
    Now, the waste isn't so much as to explain why your heap starts at 132 kB. That has more to do with the cost of requesting memory from the OS. Your C library allocator grabs big chunks from the OS at once to avoid paying that cost so often. – Ben Voigt May 29 '14 at 16:16
  • Also, if you do a heap-walk, you might find that the runtime allocated some additional objects on the heap. – Deduplicator May 29 '14 at 16:18
  • 2
    What Ben's described is typical for most implementations. The runtime is **likely** to implement a *sub-allocator*, where it grabs a sizable minimum heap-chunk via system call and divvies it up with its own sub-allocation algorithm for smaller requests. Most decent runtimes do this in one form or another, as system heap management has historically been *expensive*. – WhozCraig May 29 '14 at 16:23
  • @PeterSchneider ok, maybe I should have been more explicit. Under normal circumstances it is normal to require more heap than that. – RedX May 29 '14 at 16:57
  • @WhozCraig: Except that it's pretty rare for library allocators to allocate these superchunks via a system heap, usually they'll go directly to the virtual memory manager, avoiding the overhead of nested heaps. – Ben Voigt May 29 '14 at 16:59
  • It's not speculative. If whole program optimisation is possible the compiler could detect that no more allocation beyond this compile time computed amount will take place (I assume it doesn't here), and allocate exactly the required amount. In fact, if nothing is done with the allocated memory the compiler may optimize away the allocation altogether. – Peter - Reinstate Monica May 29 '14 at 20:41
  • Since the operating system allocates memory in pages (size depends on the OS), allocating less than a page would waste the rest of the page anyway on the OS level. – Painted Black May 30 '14 at 09:06

2 Answers

13

How come the heap is that big when all I did was allocate space for a single int?

I did a simple test on Linux. When you call calloc, glibc at some point calls sbrk() to get memory from the OS:

(gdb) bt
#0  0x0000003a1d8e0a0a in brk () from /lib64/libc.so.6
#1  0x0000003a1d8e0ad7 in sbrk () from /lib64/libc.so.6
#2  0x0000003a1d87da49 in __default_morecore () from /lib64/libc.so.6
#3  0x0000003a1d87a0aa in _int_malloc () from /lib64/libc.so.6
#4  0x0000003a1d87a991 in malloc () from /lib64/libc.so.6
#5  0x0000003a1d87a89a in calloc () from /lib64/libc.so.6
#6  0x000000000040053a in main () at main.c:6

But glibc does not ask the OS for exactly the 4 bytes you requested. It calculates its own size. This is how it is done in glibc:

  /* Request enough space for nb + pad + overhead */
  size = nb + mp_.top_pad + MINSIZE;

mp_.top_pad is 128*1024 bytes by default, which is the main reason why a 4-byte request makes the heap grow by 0x21000 bytes: 128 KiB of padding plus the request and MINSIZE, rounded up to a whole number of pages.

You can adjust mp_.top_pad with a call to mallopt. This is from the mallopt documentation:

M_TOP_PAD

This parameter defines the amount of padding to employ when
calling sbrk(2) to modify the program break.  (The measurement
unit for this parameter is bytes.)  This parameter has an
effect in the following circumstances:

*  When the program break is increased, then M_TOP_PAD bytes
 are added to the sbrk(2) request.

In either case, the amount of padding is always rounded to a
system page boundary.

So I changed your program and added a call to mallopt:

#include <stdlib.h>
#include <malloc.h>
int main()
{
  mallopt(M_TOP_PAD, 1);
  int* pointer = (int*)calloc(1, sizeof(int));
  return 0;
}

I set the padding to 1 byte, and according to the documentation it is always rounded up to a system page boundary.

So this is what gdb tells me for my program:

      Start Addr           End Addr       Size     Offset objfile
        0x601000           0x602000     0x1000        0x0 [heap]

So now the heap is 4096 bytes, exactly the size of a page on my system:

(gdb) !getconf PAGE_SIZE
4096


  • Is there a performance impact from doing this? I assume it must call sbrk more often if you set this? – paulm May 30 '14 at 09:35
  • 1
    I think there might be a performance impact. This is from doc: `Modifying M_TOP_PAD is a trade-off between increasing the number of system calls (when the parameter is set low) and wasting unused memory at the top of the heap (when the parameter is set high).` –  May 30 '14 at 09:38
-4

Since you mentioned C/C++, it is better to use the following construct:

int* pointer = new int(1);
Dr. Debasish Jana
  • 1
    yeah, but the OP wonders why calloc allocates so much memory, not how to make the code better. – enedil May 30 '14 at 09:39
  • This post may be helpful, http://stackoverflow.com/questions/12490534/difference-in-memory-block-layout-allocated-by-malloc-and-calloc – Dr. Debasish Jana May 30 '14 at 09:41
  • This does nothing to explain why int* pointer = (int*)calloc(1, sizeof(int)); allocates 0x21000 bytes – paulm May 30 '14 at 09:42
  • `new` is not necessarily better than `calloc` at all. The latter may instruct the OS to lazily zero-initialize memory blocks, whereas the best that can be done otherwise is to access all the memory and zero it at once, causing a bunch of page faults. – Potatoswatter May 30 '14 at 09:58