70

The following is excerpted from here:

pw = (widget *)malloc(sizeof(widget));

allocates raw storage. Indeed, the malloc call allocates storage that's big enough and suitably aligned to hold an object of type widget

Also see "Fast pImpl" by Herb Sutter, where he says:

Alignment. Any memory that's allocated dynamically via new or malloc is guaranteed to be properly aligned for objects of any type, but buffers that are not allocated dynamically have no such guarantee

I am curious about this: how does malloc know the alignment of a custom type?

Mat
Chang
  • 6
    new and malloc, by default, align addresses to 8 bytes (x86) or 16 bytes (x64), which is optimal for most complex data. It is also sizeof()'s job to return the correct struct size **with** internal padding for alignment, if necessary. – Alex Byrth May 01 '16 at 02:54

7 Answers

61

Alignment requirements are recursive: The alignment of any struct is simply the largest alignment of any of its members, and this is understood recursively.

For example, and assuming that each fundamental type's alignment equals its size (this is not always true in general), the struct X { int; char; double; } has the alignment of double, and it will be padded to be a multiple of the size of double (e.g. 4 (int), 1 (char), 3 (padding), 8 (double)). The struct Y { int; X; float; } has the alignment of X, which is the largest and equal to the alignment of double, and Y is laid out accordingly: 4 (int), 4 (padding), 16 (X), 4 (float), 4 (padding).

(All numbers are just examples and could differ on your machine.)

Therefore, by breaking it down to the fundamental types, we only need to know a handful of fundamental alignments, and among those there is a well-known largest. C++ even defines a type max_align_t whose alignment is that largest alignment.

All malloc() needs to do is to pick an address that's a multiple of that value.

Kerrek SB
  • 3
    The key thing to point out is that this doesn't include custom `align` directives to the compiler that might over-align data. – user541686 Aug 28 '13 at 04:55
  • 4
    Although if you use these you are already outside the scope of the standard, please note that memory allocated in this way probably *won't* meet the alignment requirements for built types such as _m256 that are available as extensions on some platforms. – jcoder Aug 28 '13 at 07:25
  • 6
    What happens when you specify a custom alignment via `alignas` that is larger than the largest alignment of a primitive datatype? – Curious Feb 09 '17 at 15:25
  • 2
    @Curious: Support for extended alignment is implementation-defined. – Kerrek SB Feb 10 '17 at 01:27
  • `std::max_align_t` is largest alignment of *scalar types*, so a struct or a class can potentially have stricter alignment requirement than `std::max_align_t`. – Mikhail Vasilyev Jun 14 '18 at 14:45
  • @MikhailVasilyev: Yes, but only if given an `alignas`, right? Otherwise a UDT's alignment is just made up recursively of the alignment of the members. – Kerrek SB Jun 14 '18 at 22:42
  • @KerrekSB Yes, but some widely-used classes from the standard library might turn out to be over-aligned on some platforms. F.ex. see [this issue](https://github.com/cameron314/concurrentqueue/issues/64). Also virtual classes include a pointer to a virtual table so their alignment is determined not only by alignment of the members. – Mikhail Vasilyev Jun 15 '18 at 11:49
  • 4
    `malloc` has no information on the type it is allocating for; the only parameter is the size of the allocated memory. The man page states it correctly: the allocated memory is aligned such that it can be used for any data type, i.e. the alignment is the same for all types. – Frédéric Dumont Sep 14 '19 at 23:20
  • @Massimo: I'm not sure there's a contradiction here. I first explained how alignment of a type is defined, and then how malloc can return fundamentally aligned memory. I never said that malloc knows about alignment of any one type. As long as your types don't have extended alignment, malloc gives you suitably aligned memory. – Kerrek SB May 19 '20 at 17:51
  • @KerrekSB You talk about struct alignment and you conclude with *All malloc() needs to do is to pick an address that's a multiple of that value.* Show us how you pass the struct alignment to malloc()... – Massimo May 19 '20 at 18:32
  • @Massimo: I think the "that" refers to the "largest alignment" from the previous paragraph, which is exactly the alignment guarantee malloc provides, isn't it? – Kerrek SB May 20 '20 at 10:35
29

I think the most relevant part of the Herb Sutter quote is the part I've marked in bold:

Alignment. Any memory that's allocated dynamically via new or malloc is guaranteed to be **properly aligned for objects of any type**, but buffers that are not allocated dynamically have no such guarantee

It doesn't have to know what type you have in mind, because it's aligning for any type. On any given system, there's a maximum alignment size that's ever necessary or meaningful; for example, a system with four-byte words will likely have a maximum of four-byte alignment.

This is also made clear by the malloc(3) man-page, which says in part:

The malloc() and calloc() functions return a pointer to the allocated memory that is suitably aligned for any kind of variable.

ruakh
  • 3
    What is the meaning of "any kind of variable"? It doesn't answer my question. Does it mean malloc will always use the maximum alignment size on any given system? – Chang Jan 06 '12 at 02:29
  • 2
    @Chang: effectively, yes. Also note, the quote is wrong. `new` is only guaranteed to have "any" alignment when allocating `char` or `unsigned char`. For others, it may have a smaller alignment. – Mooing Duck Jan 06 '12 at 02:33
  • @Chang: Right, the maximum alignment size. "Suitably aligned for any kind of variable" means "suitably aligned for an `int` *and* suitably aligned for a pointer *and* suitably aligned for any `struct` *and* . . .". – ruakh Jan 06 '12 at 02:35
  • @MooingDuck: `new char[16]` does not guarantee any alignment at all. (In general, `new T[n]` returns a pointer aligned for any type `X` where `sizeof(X)<=sizeof(T)`.) – aschepler Aug 28 '13 at 05:14
  • 1
    @aschepler: That's not true. See the C++11 spec, section 5.3.4, clause 10; `new char[16]` is specified in a way that's assumed to guarantee that it's suitably aligned for any type `X` where `sizeof(X)<=16`. – ruakh Aug 28 '13 at 05:33
  • @aschepler: No, `new T[n]` is only aligned for type `T`, unless `T` is (possibly signed/unsigned) `char`, then it's aligned for any type `X` where `sizeof (X) <= n`. – Ben Voigt Aug 28 '13 at 05:33
  • Um, yes. I somehow got several very wrong ideas from looking at the very same Standard paragraph. Reading comprehension fail. – aschepler Aug 28 '13 at 13:03
  • 1
    @BenVoigt: I think the "magic alignment" is only for `char` and `unsigned char`, but NOT for `signed char`. The C++ spec treats `char` and `unsigned char` as "byte" types, but does not consider `signed char` a "byte" type. (Implicitly, the spec doesn't actually say "byte types" as such.) – Mooing Duck Aug 28 '13 at 16:27
  • @MooingDuck: Looks like you're right, but I think that may be a defect in the Standard, since the accompanying note talks about generic *character arrays* which include all three: "this constraint on array allocation overhead permits the common idiom of allocating character arrays into which objects of other types will later be placed" (And that note in turn should probably say *character sequences*, since *character arrays* include wide character types as well) – Ben Voigt Aug 28 '13 at 16:36
  • @BenVoigt: I would posit _that_ was the defect, since `unsigned char` is used in byte-like ways in §3.8/5-6 §3.9/2-4, §3.10/10, and §5.3.4/10, and in none of those is `signed char` mentioned or implied. Also, §3.9/1 calls out `unsigned char` specifically with "all possible bit patterns of the value representation represent numbers". – Mooing Duck Aug 28 '13 at 16:51
5

The only information that malloc() can use is the size of the request passed to it. In general, it might do something like round up the passed size to the nearest greater (or equal) power of two, and align the memory based on that value. There would likely also be an upper bound on the alignment value, such as 8 bytes.

The above is a hypothetical discussion, and the actual implementation depends on the machine architecture and runtime library that you're using. Maybe your malloc() always returns blocks aligned on 8 bytes and it never has to do anything different.

Greg Hewgill
  • 4
    In summary then, `malloc` uses the 'worst case' alignment because it doesn't know any better. Does that mean that `calloc` can be smarter because it takes two args, the number of objects and the size of a single object? – Aaron McDaid Jan 06 '12 at 02:15
  • 1
    Maybe. Maybe not. You'd have to look at your runtime library source to find out. – Greg Hewgill Jan 06 '12 at 02:16
  • 1
    -1, sorry. Your answer includes the truth, but it also includes disinformation. It's not a "maybe, maybe not" thing; it's specifically documented to work in a way that doesn't depend on the size. (Dunno *why* not, though. It seems like it would make perfect sense for it to do so.) – ruakh Jan 06 '12 at 02:18
  • The answer to my own question is No. I found this: ["The malloc() and calloc() functions return a pointer to the allocated memory that is suitably aligned for any kind of variable."](http://linux.die.net/man/3/calloc) Seems like the memalign function is potentially useful though: http://wwwcgi.rdg.ac.uk:8081/cgi-bin/cgiwrap/wsi14/poplog/man/3C/calloc – Aaron McDaid Jan 06 '12 at 02:22
  • see ruakh's reply, so malloc will always use maximum alignment size in any given system, right? – Chang Jan 06 '12 at 02:26
3

1) Align to the least common multiple of all alignments. e.g. if ints require 4 byte alignment, but pointers require 8, then allocate everything to 8 byte alignment. This causes everything to be aligned.

2) Use the size argument to determine the correct alignment. For small sizes you can infer the type: for example, malloc(1) is always a char (assuming no other type has size 1). C++ new has the benefit of being type safe and so can always make alignment decisions this way.

Pubby
2

Before C++11, alignment was handled fairly simply: wherever the exact value was unknown, the largest alignment was used, and malloc/calloc still work this way. This means a malloc allocation is correctly aligned for any type.

Wrong alignment may result in undefined behavior according to the standard but I have seen x86 compilers being generous and only punishing with lower performance.

Note that you can also tweak alignment via compiler options or directives (#pragma pack in Visual Studio, for example).

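With placement new, however, you have to take care of alignment yourself, and C++11 brings us the new keywords alignof and alignas to help. Here is some code which shows the effect when the compiler's maximum alignment is greater than 1. The first placement new below is automatically fine, because it uses the address returned by malloc, but the second is not: the address right after A generally has to be rounded up to a multiple of alignof(B).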

#include <cstdint>
#include <cstdlib>
#include <iostream>
#include <new>
using namespace std;
int main()
{
        struct A { char c; };
        struct B { int i; char c; };

        unsigned char * buffer = (unsigned char *)malloc(1000000);
        if (!buffer)
            return 1;
        // uintptr_t is wide enough to hold a pointer; long is not on every platform
        uintptr_t mp = (uintptr_t)buffer;

        // First placement new
        uintptr_t alignofA = alignof(A) - 1; // used as a bit mask
        cout << "alignment of A: " << std::hex << (alignofA + 1) << endl;
        cout << "placement address before alignment: " << std::hex << mp << endl;
        if (mp&alignofA)
        {
            // round up to the next multiple of alignof(A)
            mp |= alignofA;
            ++mp;
        }
        cout << "placement address after alignment : " << std::hex <<mp << endl;
        A * a = new((unsigned char *)mp)A;
        mp += sizeof(A);

        // Second placement new
        uintptr_t alignofB = alignof(B) - 1; // used as a bit mask
        cout << "alignment of B: " <<  std::hex << (alignofB + 1) << endl;
        cout << "placement address before alignment: " << std::hex << mp << endl;
        if (mp&alignofB)
        {
            // round up to the next multiple of alignof(B)
            mp |= alignofB;
            ++mp;
        }
        cout << "placement address after alignment : " << std::hex << mp << endl;
        B * b = new((unsigned char *)mp)B;
        mp += sizeof(B);

        // A and B are trivially destructible, so no destructor calls are needed
        (void)a;
        (void)b;
        free(buffer);
}

I guess performance of this code can be improved with some bitwise operations.

EDIT: Replaced expensive modulo computation with bitwise operations. Still hoping that somebody finds something even faster.

Patrick Fromberg
  • 1
    It's not actually the compiler, it's the hardware itself. On x86 a misaligned memory access simply forces the processor to fetch the two sides of the memory boundary and piece the result together, so it's always "correct" if slower. On e.g. some ARM processors, you would get a bus error and a program crash. This is a bit of a problem because many programmers are never exposed to anything else than x86, and so may not know that the behaviour is actually undefined instead of merely decreasing performance. – Thomas Sep 06 '14 at 15:29
  • You are correct, it's the hardware or CPU microcode but not the actual compiler that saves you on the x86 architecture. I really wonder why there is no more convenient API to handle this. As if C/C++ designers wanted developers to step into the trap. Reminds me of the std::numeric_limits::min() trap. Anyone got that one right the first time? – Patrick Fromberg Sep 06 '14 at 23:56
  • Well, once you know what is going on, it's not too hard to change your programming style from all sorts of crazy type-punning to well-typed code, fortunately. The C type system makes it fairly easy to preserve type alignment as long as you don't go doing insane bit manipulation stuff without paying attention. Now pointer-aliasing-free code on the other hand has some much tougher semantics... – Thomas Sep 07 '14 at 12:39
  • I do not understand. You have the problem whenever you have your own little heap that you manage yourself. What use of placement new are you thinking about in your comment? – Patrick Fromberg Sep 07 '14 at 14:23
1

malloc has no knowledge of what it is allocating for because its parameter is just total size. It just aligns to an alignment that is safe for any object.

John Paul
1

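You can find out the number of alignment bits (the guaranteed trailing zero bits of every returned address) for your malloc() implementation with this small C program:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    uintptr_t find = 0;   /* bitwise OR of every address malloc() returned */
    size_t size;
    for( unsigned i = 1000000; i--; )
        if( (size = rand() & 127) != 0 )
            /* blocks are never freed, so every call returns a distinct address */
            find |= (uintptr_t)malloc( size );
    if( find == 0 )
        return 1;         /* every allocation failed */
    int bits = 0;
    /* count the low zero bits shared by all returned addresses */
    for( ; !(find & 1); find >>= 1, ++bits );
    printf( "%d\n", bits );
    return 0;
}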
Bonita Montero