0
typedef struct {    
    char c;
    char cc[2];
    short s;
    char ccc;
}stuck;

Should the struct above have a memory layout like this?

1   2   3   4   5  6     7
- c -  cc   -   s  - ccc -

or this ?

1    2   3   4    5   6   7     8 
-    c   -   cc   -   s   - ccc -

I think the first would be better, so why does my VS09 compiler choose the second? (Is my layout diagram correct, by the way?) Thank you.

Dalton
  • I don't understand your memory layout diagrams – Andreas Brinck Aug 23 '11 at 07:50
  • Implementation defined. There is no answer to this question unless you are much more specific about the compiler (including version), the OS (including version), the compiler's optimization level (as this may change the layout), and any compiler-specific attributes you have attached to the struct. – Martin York Aug 23 '11 at 14:55

5 Answers

4

I think that your structure will have the following layout, at least on Windows:

typedef struct {    
    char c;
    char cc[2];
    char __padding;
    short s;
    char ccc;
    char __tail_padding;
} stuck;

You could avoid the padding by reordering the structure members:

typedef struct {    
    char c;
    char cc[2];
    char ccc;
    short s;
} stuck;
wilx
  • Thank you, but how are you sure that it should be the first layout in your post ? – Dalton Aug 23 '11 at 08:01
  • __1 for showing `__padding` and then reordering elements. (by the way, __1 means two `dash` followed by `1` : rotate one dash by 90 degree, and adjust, you get `+1` :D) – Nawaz Aug 23 '11 at 08:02
  • Okay, I understand now after I check its element addresses, my Brain really goes Bernard – Dalton Aug 23 '11 at 08:16
  • @wilx: it would be good, I think, to mention the tail-padding added after `ccc` in the first case (one `char`). – Matthieu M. Aug 23 '11 at 08:20
  • @Matthieu M.: I did not realise there would be some, but you are right, it is necessary. Thanks! – wilx Aug 23 '11 at 08:32
2

The compiler can't choose the second. The standard mandates that the first field must be aligned with the start of the structure.

Are you using offsetof from stddef.h to find this out?

C99 6.7.2.1, paragraph 13:

A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.

It means that you can have

struct s {
    int x;
    char y;
    double z;
};

struct s obj;
int *x = (int *)&obj; /* Legal. */

Put another way

offsetof(struct s, x); /* Must yield 0. */
cnicutar
  • Thanks for your quick reply, but I don't understand what you actually mean. More details are needed, please. – Dalton Aug 23 '11 at 07:48
  • You can also print the addresses and work out the differences yourself: printf("%p %p %p\n", (void *)&obj.x, (void *)&obj.y, (void *)&obj.z); – Patrick B. Aug 23 '11 at 07:55
2

Other than at the beginning of a structure, an implementation can put whatever padding it wants in your structures so there's no right way. From C99 6.7.2.1 Structure and union specifiers, paragraphs:

/12:
Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.

/13:
There may be unnamed padding within a structure object, but not at its beginning.

/15:
There may be unnamed padding at the end of a structure or union.

Paragraph 13 also contains:

Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared.

This means that the fields within the structure cannot be re-ordered. Also, in a large number of modern implementations (though this is not mandated by the standard), the alignment of a basic type is equal to its size. For example, a 32-bit integer type may have an alignment requirement of four (8-bit) bytes.

Hence, a plausible layout would be:

offset  size  field
------  ----  -----
   0      1   char c;
   1      2   char cc[2];
   3      1   padding
   4      2   short s;
   6      1   char ccc;
   7      1   padding

but, as stated, it may be something different. The final padding is to ensure that consecutive array elements are aligned correctly (since the short most likely has to be on a 2-byte boundary).

There are a number of (non-portable) ways in which you may be able to control the padding. Many compilers have a #pragma pack option that you can use to control padding (although be careful: while some systems may just slow down when accessing unaligned data, some will actually dump core for an illegal access).

Also, re-ordering the elements within the structure from largest to smallest tends to reduce padding as well since the larger elements tend to have stricter alignment requirements.

These, and an even uglier "solution" are discussed more here.

paxdiablo
  • Thanks for your confirmation, I up-voted you and others who joined to answer my question. – Dalton Aug 23 '11 at 08:17
  • @paxdiablo: I would think there is a second `char` of padding after `ccc`, for array-alignment purposes. – Matthieu M. Aug 23 '11 at 08:22
  • Good point, @Matthieu, I didn't take arrays into account, proof that even an obnoxiously self-confident old-timer can still be wrong :-) Adjusted answer to suit. – paxdiablo Aug 23 '11 at 08:27
1

While I don't really understand your visual representation of the alignment, I can tell you that with VS you can achieve a packed structure by using a pack pragma:

__pragma( pack(push, 1) )
struct { ... };
__pragma( pack(pop) )

In general struct-alignment depends on the compiler used, the target-platform (and its address-size) and the weather, IOW in reality it is not well defined.

Patrick B.
0

Others have mentioned that padding may be introduced either between members or after the last member.

The interesting thing though, I believe, is to understand why.

Types usually have an alignment. This property specifies which addresses are valid (or not) for a particular type. On some architectures this is a loose requirement (if you do not respect it, you only incur some overhead); on others, violating it causes hardware exceptions.

For example (arbitrary, as each platform defines its own):

  • char: 1
  • short (16 bits): 2
  • int (32 bits): 4
  • long int (64 bits): 8

The alignment of a compound type is usually the maximum of the alignments of its parts.


How does alignment influence padding?

In order to respect the alignment of a type, some padding may be necessary, for example:

struct S { char a; int b; };

align(S) = max(align(a), align(b)) = max(1, 4) = 4

Thus we have:

// S allocated at address 0x10 (divisible by 4)
0x10 a
0x11 (padding)
0x12 (padding)
0x13 (padding)
0x14 b
0x15 b
0x16 b
0x17 b

Note that because b can only be allocated at an address that is also divisible by 4, there is some unused space between a and b. This space is called padding.


Where does padding come from?

Padding has two different causes:

  • between members, it is caused by a difference in alignment (see above)
  • at the end of the struct, it is caused by array requirements

The array requirement is that elements of an array should be allocated without intervening padding. This allows one to use pointer arithmetic to navigate from an element to another:

+---+---+---+
| S | S | S |
+---+---+---+

S* p = /**/;
p = p + 1; // <=> p = (S*)((char*)p + sizeof(S));

This means, however, that the size of the structure S needs to be a multiple of the alignment of S.

Example:

struct S { int a; char b; };

+----+-+---+
|  a |b| ? |
+----+-+---+

a: offset 0, size 4
b: offset 4, size 1
?: offset 5, size 3 (padding)

Putting it all together:

typedef struct {    
    char a;
    char b[2];
    short s;
    char c;
} stuck;

+-+--+-+--+-+-+
|a| b|?|s |c|?|
+-+--+-+--+-+-+

If you really wish to avoid padding, one simple trick (which involves neither addition nor subtraction) is to order your members starting from the one with the largest alignment.

typedef struct {
  short s;
  char a;
  char b[2];
  char c;
} stuck;

+--+-+--+-+
| s|a| b|c|
+--+-+--+-+

It's a simple rule of thumb, and a useful one, because the alignment of basic types may change from platform to platform (32-bit/64-bit) whereas the relative order of the types is pretty stable (exception: pointers).

Matthieu M.