26

I also want to know whether glibc malloc() does this.

Peter Cordes
Anonymous

4 Answers

85

Suppose that you have the following structure:

struct S {
    short a;
    int b;
    char c, d;
};

Without alignment, it would be laid out in memory like this (assuming a 32-bit architecture):

 0 1 2 3 4 5 6 7
|a|a|b|b|b|b|c|d|  bytes
|       |       |  words

The problem is that on some CPU architectures, the instruction to load a 4-byte integer from memory only works on word boundaries. So your program would have to fetch each half of b with separate instructions.

But if the memory was laid out as:

 0 1 2 3 4 5 6 7 8 9 A B
|a|a| | |b|b|b|b|c|d| | |
|       |       |       |

Then access to b becomes straightforward. (The disadvantage is that more memory is required, because of the padding bytes.)

Different data types have different alignment requirements. It's common for char to be 1-byte aligned, short to be 2-byte aligned, and 4-byte types (int, float, and pointers on 32-bit systems) to be 4-byte aligned.
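To see the layout your own compiler picks, a minimal sketch like the following can help. It uses the standard offsetof macro and the C11 _Alignof operator; print_layout is a hypothetical helper name, and the exact numbers depend on your platform's ABI, so none are claimed here.

```c
#include <stddef.h>   /* offsetof, size_t */
#include <stdio.h>    /* printf */

struct S {
    short a;
    int b;
    char c, d;
};

/* Hypothetical helper: print where the compiler placed each member of
 * struct S, plus the size and alignment of the whole structure. */
void print_layout(void)
{
    printf("offset of a: %zu\n", offsetof(struct S, a));
    printf("offset of b: %zu\n", offsetof(struct S, b));
    printf("offset of c: %zu\n", offsetof(struct S, c));
    printf("offset of d: %zu\n", offsetof(struct S, d));
    printf("sizeof(struct S): %zu\n", sizeof(struct S));
    printf("alignment of struct S: %zu\n", (size_t)_Alignof(struct S));
}
```

Calling print_layout() from main should show b at an offset that is a multiple of int's alignment, with any gap before it being the padding drawn in the second diagram.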

malloc is required by the C standard to return a pointer that's properly aligned for any data type.

glibc malloc on x86-64 returns 16-byte-aligned pointers.
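You can check this on your own system. The sketch below assumes a C11 compiler (for max_align_t and _Alignof). The helper names is_aligned and malloc_is_max_aligned are made up for illustration.

```c
#include <stddef.h>   /* max_align_t (C11), size_t */
#include <stdint.h>   /* uintptr_t */
#include <stdlib.h>   /* malloc, free */

/* Return 1 if p is a multiple of alignment, 0 otherwise. */
int is_aligned(const void *p, size_t alignment)
{
    return (uintptr_t)p % alignment == 0;
}

/* Hypothetical helper: allocate n bytes and report whether the pointer
 * satisfies the strictest fundamental alignment.
 * Returns 1 (aligned), 0 (not aligned), or -1 if malloc failed. */
int malloc_is_max_aligned(size_t n)
{
    void *p = malloc(n);
    if (p == NULL)
        return -1;
    int ok = is_aligned(p, _Alignof(max_align_t));
    free(p);
    return ok;
}
```

One successful check only shows what a particular allocation happened to return. The guarantee itself comes from the standard and from the allocator's documentation.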

dan04
  • 1
    Sorry, I don't understand what you mean by "it's common for char to be 1-byte aligned, short to be 2-byte aligned, and 4-byte types". – Anni_housie Nov 19 '16 at 07:42
  • 2
    @Anni_housie it just means, most systems commonly use 1 byte of memory to store a char, 2 bytes to store short, 4 bytes for int/float/pointer and so on – Richardson Ansong Nov 19 '16 at 09:32
  • 3
    Or you could just re-order the elements in your struct to have the int first, then the short, then the two chars. That way it's going to be easy for the system to read them. – Omarito May 04 '19 at 22:45
  • Indeed, the order the members of a struct are listed in affects the final size of the struct because of the alignment of the members. – ljleb Jan 26 '22 at 00:58
  • @dan04 thanks for the answer, now I can get why 2 bytes padding are inserted between a and b. I read somewhere saying in your example there will also be another 2 bytes padding inserted in the end after member c to align the entire structure, why is this needed? – torez233 Mar 07 '22 at 04:47
  • 2
    @torez233 I think it's because, in the example above with a 32-bit architecture, the final 2 padding bytes help align the next allocated instance of this structure correctly. Otherwise the next instance would have its "a" bytes occupying the final 2 bytes of the first instance's memory, next to its "c|d" bytes. On a 32-bit architecture, each memory address is 32 bits, so there will always be 32 bits loaded even if you just want the "c" char, for example – Nate Wilson May 24 '22 at 20:10
  • @NateWilson: Exactly. – dan04 May 25 '22 at 16:27
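The two points raised in the comments (member order and trailing padding) can be checked with a short sketch. S_reordered and stride_of_S are hypothetical names introduced here. On a typical ABI with a 4-byte, 4-byte-aligned int, struct S usually occupies 12 bytes and the reordered version 8, but that depends on your platform.

```c
#include <stddef.h>   /* size_t */

/* Original member order: padding after a and after d. */
struct S {
    short a;
    int b;
    char c, d;
};

/* Same members, most strictly aligned first: typically no padding. */
struct S_reordered {
    int b;
    short a;
    char c, d;
};

/* Distance in bytes between consecutive elements of an array of struct S.
 * It always equals sizeof(struct S), which is why trailing padding exists:
 * it keeps b correctly aligned in every element, not just the first. */
size_t stride_of_S(void)
{
    struct S arr[2];
    return (size_t)((char *)&arr[1] - (char *)&arr[0]);
}
```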
12

Alignment requirements specify what address offsets can be assigned to what types. This is completely implementation-dependent, but is generally based on word size. For instance, some 32-bit architectures require every int variable to start at an address that is a multiple of four. On some architectures, alignment requirements are absolute. On others (e.g. x86), flouting them only incurs a performance penalty.

malloc is required to return an address suitable for any alignment requirement. In other words, the returned address can be assigned to a pointer of any type. From C99 §7.20.3 (Memory management functions):

The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated).

Matthew Flaschen
1

The malloc() documentation says:

[...] the allocated memory that is suitably aligned for any kind of variable.

This is true for almost everything you do in C/C++. However, as others have pointed out, special cases exist that require a stricter alignment. For example, Intel processors support a 256-bit type, __m256, whose 32-byte alignment is not guaranteed by malloc().

Similarly, if you want to allocate a buffer that must start on a page boundary (like the addresses returned by mmap(), etc.), you need a potentially very large alignment. malloc() would waste a lot of memory if it always returned buffers aligned to such boundaries.

Under Linux or other Unix systems, I suggest you use the posix_memalign() function:

int posix_memalign(void **memptr, size_t alignment, size_t size);

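This is the standard POSIX function for such needs. The alignment must be a power of two and a multiple of sizeof(void *), and the returned memory is released with free().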
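A minimal usage sketch follows. alloc_aligned is a hypothetical wrapper name. The feature-test macro is there so that the declaration is visible under strict -std=c11 style compiler flags.

```c
#define _POSIX_C_SOURCE 200112L   /* expose posix_memalign in <stdlib.h> */
#include <stdlib.h>               /* posix_memalign, free, size_t */

/* Hypothetical wrapper: return size bytes aligned to alignment, or NULL.
 * alignment must be a power of two and a multiple of sizeof(void *). */
void *alloc_aligned(size_t alignment, size_t size)
{
    void *p = NULL;
    /* posix_memalign returns 0 on success and an error number otherwise;
     * it does not set errno. */
    if (posix_memalign(&p, alignment, size) != 0)
        return NULL;
    return p;   /* release with free() */
}
```

For example, alloc_aligned(32, 1024) would be one way to get a buffer suitable for __m256 loads and stores.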


As a side note, you could still use malloc(): allocate size + alignment - 1 bytes and align the returned pointer yourself with (ptr + alignment - 1) & -alignment (not tested, all casts missing; alignment must be a power of two). The aligned pointer is not the one to pass to free(), so you have to keep the pointer that malloc() returned in order to call free() properly. As mentioned above, this means you lose up to alignment - 1 bytes per such malloc(). In contrast, the posix_memalign() function should not lose more than sizeof(void*) * 4 - 1 bytes, although since your size is likely a multiple of alignment, you would only lose sizeof(void*) * 2... unless you only allocate such buffers, in which case you lose a full alignment bytes each time.
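Here is one way the manual approach could be written out with the casts filled in. This is a sketch, not a tested library. The names my_aligned_malloc and my_aligned_free are hypothetical. Storing the original pointer in the bytes just before the aligned block is my own choice for keeping it around, and it costs an extra sizeof(void *) bytes per allocation. The mask ~(alignment - 1) is equivalent to -alignment for powers of two.

```c
#include <stdint.h>   /* uintptr_t, SIZE_MAX */
#include <stdlib.h>   /* malloc, free */

/* Hypothetical helper. alignment must be a power of two. */
void *my_aligned_malloc(size_t alignment, size_t size)
{
    /* Extra room: worst-case adjustment plus the saved malloc pointer. */
    size_t extra = alignment - 1 + sizeof(void *);
    if (size > SIZE_MAX - extra)
        return NULL;                      /* size computation would overflow */

    void *raw = malloc(size + extra);
    if (raw == NULL)
        return NULL;

    /* Skip past the slot for the saved pointer, then round up. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);

    ((void **)aligned)[-1] = raw;         /* remember what to pass to free() */
    return (void *)aligned;
}

/* Free a block obtained from my_aligned_malloc (NULL is allowed). */
void my_aligned_free(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

A pointer from my_aligned_malloc must only be released with my_aligned_free. Passing it to free() directly would hand the allocator an address it never returned.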

Alexis Wilke
1

If you have particular memory alignment needs (for particular hardware or libraries), you can check out non-portable memory allocators such as _aligned_malloc() and memalign(). These can easily be abstracted behind a "portable" interface, but are unfortunately non-standard.
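Such a "portable" interface might look like the sketch below. The names portable_aligned_alloc and portable_aligned_free are hypothetical. On non-Windows systems this sketch uses posix_memalign() rather than memalign(), which is an assumption about the target platform. On Windows it relies on _aligned_malloc() and _aligned_free() from the Microsoft C runtime.

```c
#define _POSIX_C_SOURCE 200112L   /* expose posix_memalign on POSIX systems */
#include <stdlib.h>
#if defined(_WIN32)
#include <malloc.h>               /* _aligned_malloc, _aligned_free */
#endif

/* Hypothetical wrapper: alignment must be a power of two (and, on POSIX,
 * a multiple of sizeof(void *)). Returns NULL on failure. */
void *portable_aligned_alloc(size_t alignment, size_t size)
{
#if defined(_WIN32)
    return _aligned_malloc(size, alignment);   /* note: size comes first */
#else
    void *p = NULL;
    return posix_memalign(&p, alignment, size) == 0 ? p : NULL;
#endif
}

/* Memory from _aligned_malloc must not be passed to free(), so the
 * matching release function is wrapped as well. */
void portable_aligned_free(void *p)
{
#if defined(_WIN32)
    _aligned_free(p);
#else
    free(p);
#endif
}
```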

André Caron