
I was learning about struct padding in C and came across this video.

Basically it says if I have a struct

struct abc {
    char a;    // 1 byte
    char b;    // 1 byte
    int c;     // 4 bytes
} var;

Then, instead of storing the struct like this (c,...,c denote the four bytes of c; || is the word boundary; _ marks one byte)

_  _  _  _ || _  _  _  _
a  b  c  c    c  c    

Two bytes of empty space will be padded after b, resulting in (e denotes empty)

_  _  _  _ || _  _  _  _
a  b  e  e    c  c  c  c

This lets the CPU fetch int c in a single aligned memory access rather than two.

However, this does build on the assumption that the first member (a in my case) of struct will be stored immediately after word boundary. Is it always so?

jleng
  • Which compiler are you using? – Tony Tannous Apr 29 '21 at 18:35
  • And which CPU are you targeting? – Shawn Apr 29 '21 at 18:40
  • It sounds like what you are interested in is the alignment requirements for structs. – Christian Gibbons Apr 29 '21 at 18:46
  • I feel it important to note that it is not word boundaries that are at play here, but alignment requirements. There can be padding even within a word. For example, try a struct that starts with a `char` as the first element, and a `short` as the second. – Christian Gibbons Apr 29 '21 at 19:02
  • @Tony I'm using the default GCC from Segger Embedded Studio. – jleng Apr 29 '21 at 19:03
  • @Shawn It's an Arm Cortex M4F – jleng Apr 29 '21 at 19:04
  • @ChristianGibbons Thanks! But I'm not sure if in the example I have above, does padding always happen? I mean if 'a' and 'b' can be stored at the latter two bytes of the first word, then it seems padding doesn't happen because it's no longer necessary. – jleng Apr 29 '21 at 19:15
  • I believe Eric Postpischil's adequately answers that question. To boil it down, your struct's alignment requirement will be on a 4-byte boundary, because the strictest alignment requirement of any of its members is your `int` which has a 4-byte alignment requirement. Therefore `a` will be aligned on a 4-byte boundary, b will be offset from it by 1, and then you will have two bytes of padding before you reach the next 4-byte boundary. – Christian Gibbons Apr 29 '21 at 19:47
  • @ChristianGibbons: Got it. Thanks! – jleng Apr 29 '21 at 20:41

3 Answers


The address of an object of a structure type is always equal to the address of the first member of the object.

From the C Standard (6.7.2.1 Structure and union specifiers)

15 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.

Here is a demonstrative program

#include <stdio.h>

int main(void) 
{
    struct abc
    {
        char a;
        char b;
        int c;
    } abc = { 'A', 'B', 3 };
    
    printf( "&abc = %p, &abc.a = %p\n", ( void * )&abc, ( void * )&abc.a );
    
    struct abc *p = &abc;
    
    printf( "*( char * )p = %c\n", *( char * )p );
    
    return 0;
}

The program output might look like

&abc = 0x7ffe8cfad6c0, &abc.a = 0x7ffe8cfad6c0
*( char * )p = A
Vlad from Moscow

However, this does build on the assumption that the first member of struct will be stored immediately after word boundary. Is it always so?

Yes.

When a structure type is defined, the alignment requirement of the structure will be at least the strictest alignment requirement of its members. For example, if a structure has members with alignment requirements of 1 byte, 8 bytes, and 4 bytes, the alignment requirement of the structure will be 8 bytes. The compiler will figure this out automatically when the structure is defined. (Technically, the C standard might permit the compiler to give the structure an even greater alignment, as I do not see any rule against it, but that is not done in practice.)

Then, whenever the C implementation reserves memory for a structure object (as when you define an object of that type, such as struct foo x), it will ensure the memory is aligned as required for that structure. That results in the alignment requirements of the members being satisfied too. When a program allocates memory with malloc, the returned memory is always aligned as necessary for any object of the requested size.

(If you do any “funny stuff” in a program to set your own memory locations for objects, such as placing one in the middle of memory allocated with malloc, you are responsible for getting the alignment right.)

Further, the structure will be padded at the end if necessary so that its total size is a multiple of that alignment requirement. Then, in an array of those structures, each successive element of the array will begin at a properly aligned location too.

Eric Postpischil
  • Thank you! Can you please elaborate on "the alignment requirement of the structure will be at least the strictest alignment requirement of its members"? An example would be highly appreciated. – jleng Apr 29 '21 at 19:22
  • @jleng: If the members of a structure have alignment requirements of 1 byte, 1 byte, 4 bytes, 8 bytes, 1 byte, and 4 bytes, then the alignment requirement of the structure will be 8 bytes, since that is the strictest alignment requirement of the members. – Eric Postpischil Apr 29 '21 at 19:57
  • @jleng: Most platforms require multi-byte objects to be "aligned" such that they start on an address that's a multiple of 2 or 4 or 8 (depending on the platform, the type, and other considerations). If a member of a `struct` must be aligned such that its address is a multiple of 4, then the `struct` object itself will also be aligned such that its address is a multiple of 4. – John Bode Apr 29 '21 at 20:03
  • @JohnBode: Thank you. This clears up all the confusions I had left. – jleng Apr 29 '21 at 20:40

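Padding is something the compiler inserts so that each member meets its alignment requirement, which makes access easier for the CPU. Most compilers let you disable it. For example, in GCC you can use __attribute__((packed)). Be aware that accessing a misaligned member of a packed structure may be slower and, on some architectures, can fault.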

Also see How to override C compiler aligning word-sized variable in struct to word boundary.

Nasrat Takoor