C++ new is 64-byte aligned and equal to cache line size

Question

Possible Duplicate:
Is there any guarantee of alignment of address return by C++'s new operation?

In this program, i am printing each address returned by new for unsigned chars. Then deleting them backwards in the end.

#include "stdafx.h"
#include<stdlib.h>
void func();

int main()
{
    int i=10;
    while(i-->0)printf("loaded %i \n", (new unsigned char));
    getchar();
    unsigned char *p=new unsigned char;printf("last pointer loaded %i \n", p);
    i=10;
    while(i-->0)delete (p-=64);
    getchar();
    p+=640;
    delete p;//nearly forgot to delete this ^^
    return 0;
}

output:

enter image description here

As you can see, each new returns 64-byte aligned data.

Question: Is this 64-Byte being equal to cache-line size or just a compiler thing?

Question: Should i make my structures at mostly 64-bytes long?

Question: will this be different when i change my cpu, ram, OS or compiler?

Pentium-m, VC++ 2010 express, windows-xp

Thanks.

`malloc` is required to return a pointer suitably aligned for any type. I can't remember offhand if `new` has the same requirement, or just to be aligned for the specific type given. However, the standard makes no mention whatsoever of cache lines. — BoBTFish, Aug 10 '12 at 09:34
http://stackoverflow.com/questions/506518/is-there-any-guarantee-of-alignment-of-address-return-by-cs-new-operation `new` returns a pointer aligned for any type. — BoBTFish, Aug 10 '12 at 09:38
This question is far more complete than the one it is marked as a duplicate of. :( — jterm, Jun 10 '18 at 03:43

marko · Accepted Answer · 2012-08-10T22:52:08.757

The implementation choices for a heap manager make a lot more sense when you consider what happens after a large number of allocations and deallocations.

A call to malloc() needs to locate a block of unused block of sufficient size to allocate. It could be bigger (in which case, it could either create a free block with the difference - or waste it). A naive strategy of finding the closest size of block is called best fit. If it goes onto to create new free blocks, you could alternatively call it worst leave.

After use, the best-fit approach results in a large amounts of fragmentation, caused by small blocks that are unlikely to be ever allocated again, and the cost of searching the free blocks becomes high.

Consequently, high performance heap managers don't work like this. Instead they operate as pool allocators for various fixed block-sizes. Schemes in which the blocks are powers of 2 (e.g. 64,128,256,512...) the norm, although throwing in some intermediates is probably worthwhile too (e.g. 48,96,192...). In this scheme, malloc() and free() are both O(1) operations, and the critical sections in allocation are minimal - potentially per pool - which gets important in a multi-threaded environment.

The wasting of memory in small allocations is a much lesser evil than fragmentation, O(n) alloc\dealloc complexity and poor MT performance.

The minimum block size w.r.t. to the cache line size is one of those classic engineering trade-offs, and it's a safe bet that Microsoft did quite a bit of experimentation to arrive at 64 as their minimum. FWIW, I'm pretty sure you'll find the cache-line size of modern CPUs are bigger than that.

C++ new is 64-byte aligned and equal to cache line size

1 Answers1