Is C++ memory alignment correct or inefficient?

Question

I wrote this code to find out how much memory C++ actually reserves for each call to the new operator.

#include<iostream>
using namespace std;
int main() {

  cout << "alignment of " << alignof(int) << endl;
  int *intP1 = new int;
  *intP1 = 100;
  cout << "address of intP1 " << intP1 << endl;
  int *intP2 = new int;
  *intP2 = 999;
  cout << "address of intP2 " << intP2 << endl;
  int *intP3 = new int;
  cout << "address of intP3 " << intP3 << endl;
  *intP3 = 333;

  cout << endl;
  cout << (reinterpret_cast<char *>(intP3)-reinterpret_cast<char *>(intP2)) << endl;
  cout << intP3-intP2 << endl;
  cout << endl;

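  cout << *(intP1) << endl;
  // The reads below go past the single int that intP1 owns: undefined behavior,
  // used here only to peek at what lies between the allocations.
  cout << *(intP1+4) << endl;
  cout << *(intP1+8) << endl;
  cout << *(intP1+16) << endl;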
  delete intP1;
  delete intP2;
  delete intP3;
  return 0;
}

After compiling the code with the -std=c++11 flag and running it, here is what I got on an x86_64 machine.

    alignment of int4
    address of intP1 = 0xa59010
    address of intP2 = 0xa59030
    address of intP3 = 0xa59050

    the distance of intP3 and intP2 = 32

    intP1 value = 100
    is this a padding value = 0
    intP2 value = 999
    intP3 value = 333

It seems that when using new to allocate 4 bytes of memory for an integer, it actually reserves a 32-byte block, which is enough space for 8 integers. According to the explanation of C++ alignment, on a 64-bit machine memory is aligned on 16 bytes, so why is the distance here 32 bytes?

Could someone help me sort this out? Thanks in advance.

"According to the explanation of the c++ alignment, for 64 bit machine, memory is aligned on 16 bytes" - out of interest, which explanation says that? — Steve Jessop, Nov 13 '12 at 23:37
This is closely related to a question I asked the other day that got some very interesting answers: http://stackoverflow.com/questions/13286706/malloc-vs-new-different-padding/ — hcarver, Nov 13 '12 at 23:42
@Steve Please check this article: Data Alignment when Migrating to 64-Bit Intel® Architecture (http://software.intel.com/en-us/articles/data-alignment-when-migrating-to-64-bit-intel-architecture) — Dancing_bunny, Nov 13 '12 at 23:49
Memory allocation is not detailed in the language specification; it is up to either the compiler's library or the platform operating systems. Some compiler libraries forward the request to the OS. A memory allocator on an embedded system may be completely different than an allocator on a PC/Server that has lots of memory. — Thomas Matthews, Nov 13 '12 at 23:51
@Mark I am using g++ 4.7.2 and both debug and release build gave the same result. — Dancing_bunny, Nov 13 '12 at 23:53
Some implementations will give you different results for debug and release. The release memory allocator is designed for speed/space. The debug memory allocator is designed for recovery if there is a problem. On some systems these will be the same. But usually there is some book-keeping overhead associated with each allocated block; its size and location will depend on the implementation. — Martin York, Nov 13 '12 at 23:56
@Dancing_bunny: that article is actually about Itanium, not x86_64. Not that I'm claiming 16-alignment is never a good idea on x86_64 (it often is a good idea), just that it's not necessary when you do `new int`, and that it's not right to say "for 64 bit machines, memory is aligned on 16 bytes". — Steve Jessop, Nov 13 '12 at 23:59

score 5 · Accepted Answer · answered Nov 13 '12 at 23:40

It has nothing to do with alignment -- it's extra overhead from how the internal memory allocator works. Typically, each memory block has extra hidden information at the front and/or back that is used for maintaining the heap's structure. Exactly how much overhead there is will vary from platform to platform and from implementation to implementation.

For example, Doug Lea's malloc has an extra overhead of 4-8 bytes per allocation (32-bit pointers) or 8-16 bytes (64-bit pointers) and a minimum allocation size of 16 bytes (32-bit) or 32 bytes (64-bit). That means for even 1-byte allocations, the memory allocator requires a total of 16 bytes of tracking overhead.

Steve Jessop · Answer 2 · 2012-11-14T00:01:07.800

The 32-byte difference is not just for alignment. Indeed, observe that the address 0xa59010 is not 32-aligned, it's only 16-aligned. So the alignment of your addresses would not be any worse if they were only 16 bytes apart rather than 32.

Rather, the 32 byte difference is an overhead/inefficiency of the memory allocator. I suspect that the allocator:

- is helpfully giving you 16-aligned addresses. This is what you need for 128-bit SSE types, so it's useful to you, but I don't know whether that's the main reason the allocator is 16-aligning, or whether it's just convenient for the allocator.
- requires some space "before" the allocation for book-keeping information, which might be 16 bytes (2 pointers or a pointer and a size), but even if not it's rounded up to 16 bytes.
- only requires 4 bytes for your actual data, but because of the 16 bytes of book-keeping and the 16 byte alignment, the minimum distance between allocations is 32 bytes. So there are 12 bytes of "slack space" / "internal fragmentation" / "waste" when you make a 4 byte allocation.

But that's just a guess, I haven't looked into whatever allocator you're using.

One minor detail: it's not necessarily any real overhead or inefficiency in the allocator -- rather, a typical allocator will simply select some minimum size (and increment) for allocated blocks. For example, the designer might decide that the minimum *usable* memory is 16 bytes (and go up from there in powers of two). This helps reduce the number of different block sizes needed, which can help keep book-keeping more reasonable. — Jerry Coffin, Nov 13 '12 at 23:47
@JerryCoffin: sure, any given "overhead" might be worth paying on average -- you'd hope the authors of the memory allocator have a reason to think it is. When you're in a special case where it isn't (such as making a vast number of 4 byte allocations and running out of memory) it becomes an "inefficiency" ;-) — Steve Jessop, Nov 13 '12 at 23:48

score 1 · Answer 3 · answered Nov 13 '12 at 23:48

Debug versions of new can add quite a bit of padding to give guard space, so that some heap corruptions can be detected. You should run it with both debug and release builds to see if there's a difference.

Mark Ransom

They are the same in both debugging and release modes. – Dancing_bunny Nov 14 '12 at 01:41

score 0 · Answer 4 · answered Nov 13 '12 at 23:37

You have both a pointer and a memory value. This allocates 2 memory blocks, which you already stated are 16 bytes apiece. 2x16=32.

I believe this will give you the result you were looking for.

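  #include <iostream>
  using namespace std;
  int main() {
    // Plain automatic (stack) variables: no new, so no heap allocator is involved.
    cout << "alignment of " << alignof(int) << endl;
    int intP1 = 100;
    cout << "address of intP1 " << &intP1 << endl;
    int intP2 = 999;
    cout << "address of intP2 " << &intP2 << endl;
    int intP3 = 333;
    cout << "address of intP3 " << &intP3 << endl;
    return 0;
  }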

PearsonArtPhoto

The pointers should be allocated on the stack, while the integers themselves would be allocated on the heap. They will not affect each other. – ProdigySim Nov 13 '12 at 23:44

Konstantin Dinev · Answer 5 · 2012-11-13T23:44:17.357

int4 means it occupies 4 bytes, not 4 bits. Everything that the compiler shows you is actually accurate! Here is some documentation on primitives and what the int primitive means.

Here is a tutorial on how to define a 4 bit integer.

Let me just mention that alignof is architecture dependent. In some architectures int means 16 bit int or 2 bytes instead of 4.

edited Nov 13 '12 at 23:44

answered Nov 13 '12 at 23:38

score 0 · Answer 6 · answered Nov 13 '12 at 23:48

Absolutely nothing requires the operating system to allocate the three pointers intP1, intP2 and intP3 adjacent to one another. Your code may be detecting overhead in the allocations (and of course it's a reasonable assumption that there is some) but it's not sufficient to prove that the spacing is necessarily all allocator overhead.

score 0 · Answer 7 · answered Nov 13 '12 at 23:52

The C++ standard makes no guarantees on how much overhead each heap allocation has. Regardless of alignment, the allocator typically adds extra overhead. It is very common for allocators to fulfill small allocations out of pre-sized buckets. Here it appears that the smallest bucket is 32 bytes per allocation which isn't unusual. You will seldom find allocators in the wild with buckets smaller than 16 bytes.

If you were to allocate more than 1 int, say int[2] you would probably notice that the memory size taken is identical: 32 bytes.

Note also that there are no guarantees from the C++ standard or allocators that 2 allocations of the same size be contiguous. This may be respected most of the time but should not be relied on.

I totally agree. Even when I new a struct consisting of 3 pointers (which should consume 24 bytes), the memory actually taken is again 32 bytes. However, a struct of 4 pointers consumes 48 bytes, which kind of shows the 16-byte alignment. — Dancing_bunny, Nov 14 '12 at 01:44
