Memory alignment and padding — difference between 32 and 64 bits

Question

I would like to understand the results I get when compiling the following small program with "gcc -m32" and "gcc -m64":

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

int main() {

struct MixedData
{
 char Data1;
 short Data2;
 int Data3;
 char Data4;
};

struct X {
 char c;
 uint64_t x;
};

printf("size of struct MixedData = %zu\n", sizeof(struct MixedData));
printf("size of struct X = %zu\n", sizeof(struct X));
printf("size of uint64_t = %zu\n", sizeof(uint64_t));

return 0;
}

With "gcc -m32", the output is:

size of struct MixedData = 12
size of struct X = 12
size of uint64_t = 8

Is the size of struct X equal to 12 because the compiler inserts the following padding?

struct X {
 char c;     // 1 byte
 char d[3];  // 3 bytes
 uint64_t x; // 8 bytes
};

If so, what is the size of a single word in a 32-bit compilation (4 bytes?)? If it is 4 bytes, this would be consistent, because 12 is a multiple of 4.

Now, concerning the size of MixedData with "gcc -m32", I get "size of struct MixedData = 12". I don't understand this value, because I read that the total size of a structure has to be a multiple of the size of its largest member. Here, the largest member of MixedData is int Data3, with sizeof(Data3) = 4 bytes, so why don't we get the following padding instead:

struct MixedData
{
 char Data1;        // 1 byte
 char Datatemp1[3]; // 3 bytes
 short Data2;       // 2 bytes
 short Data2temp;   // 2 bytes
 int Data3;         // 4 bytes 
 char Data4;        // 1 byte
 char Data4temp[3]; // 3 bytes
};

The total size of struct MixedData would then be 16 bytes, not the 12 bytes I get.

Can anyone see what is wrong with these two interpretations?

A similar question arises with "gcc -m64"; the output is:

size of struct MixedData = 12
size of struct X = 16
size of uint64_t = 8

The size of struct X (16 bytes) seems consistent, because I think the compiler in 64-bit mode inserts the following padding:

struct X {
 char c;     // 1 byte
 char d[7];  // 7 bytes
 uint64_t x; // 8 bytes
};

But I don't understand the size of struct MixedData (12 bytes). I don't see how the compiler sets the padding in this case, because 12 is not a multiple of the memory word in 64-bit mode (supposing the word is 8 bytes). Could you tell me what padding "gcc -m64" generates for struct MixedData?

short Data2temp[2];// 2 bytes that is 4 bytes actually. what did the disassembly show? — old_timer, Feb 13 '17 at 17:57
Where did you get this rule the size of the struct has to be a multiple of the largest item? — old_timer, Feb 13 '17 at 17:58
I compile the code snippet from intel i7 x86_64. Concerning the rule, I thought the size of structure had to be a multiple of the largest item because if I had an array of struct, the elements would be more directly reachable, wouldn't be it ? — , Feb 13 '17 at 18:04
then the initial MIxedData would be either 16 or 32 bytes with each element being 4 or 8 bytes. But it isnt. — old_timer, Feb 13 '17 at 18:10
This rule might be wrong. Actually, 2 parameters have to be taken into account : size of memory word and size of the largest item. If size of largest item is greater than memory word, then we do padding such that total size of structure is a multiple of memory word. Else if size of memory word is greater than we do padding such that total size is a multiple of memory word : but this isn't still explain why I get 12 bytes for MixedData structure with "gcc -m64", do you understand my issue ? — , Feb 13 '17 at 18:24
see my answer, without further command line options they are trying to align the accesses 16 bit on a 16 bit boundary and 32 on a 32. Requiring in this case one byte of padding, but then they add 3 more perhaps/probably to make the whole thing a multiple of 4. so the whole thing leaves the next thing aligned on a 32 bit boundary. — old_timer, Feb 13 '17 at 18:26

score 0 · Accepted Answer · answered Feb 13 '17 at 18:24

This one is a curiosity

struct
{
 char Data1;
 short Data2;
 int Data3;
 char Data4;
} x;

unsigned fun ( void )
{
    x.Data1=1;
    x.Data2=2;
    x.Data3=3;
    x.Data4=4;
    return(sizeof(x));
}

Compiling, then disassembling:

64-bit (-m64)

0000000000000000 <fun>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   c6 05 00 00 00 00 01    movb   $0x1,0x0(%rip)        # b <fun+0xb>
   b:   66 c7 05 00 00 00 00    movw   $0x2,0x0(%rip)        # 14 <fun+0x14>
  12:   02 00 
  14:   c7 05 00 00 00 00 03    movl   $0x3,0x0(%rip)        # 1e <fun+0x1e>
  1b:   00 00 00 
  1e:   c6 05 00 00 00 00 04    movb   $0x4,0x0(%rip)        # 25 <fun+0x25>
  25:   b8 0c 00 00 00          mov    $0xc,%eax
  2a:   5d                      pop    %rbp
  2b:   c3                      retq   

32-bit (-m32)

00000000 <fun>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   c6 05 00 00 00 00 01    movb   $0x1,0x0
   a:   66 c7 05 02 00 00 00    movw   $0x2,0x2
  11:   02 00 
  13:   c7 05 04 00 00 00 03    movl   $0x3,0x4
  1a:   00 00 00 
  1d:   c6 05 08 00 00 00 04    movb   $0x4,0x8
  24:   b8 0c 00 00 00          mov    $0xc,%eax
  29:   5d                      pop    %ebp
  2a:   c3                      ret

Understand that -m32 and -m64 are perhaps poorly described: one basically targets the 32-bit processor with 32-bit registers (ebx, eax, ax, ah, but not rbx, rax) and the other the 64-bit processor with 64-bit registers (rbx, ebx, bx, bh, bl).

There doesn't have to be a connection between the size or construction of structs and the instruction set chosen.

The interesting thing here is that the members add up to 1+2+4+1 = 8, so they could have done it in 8 bytes. They probably wanted the int aligned, which pads it by a byte, and perhaps they wanted the whole thing aligned on a 32-bit boundary, adding 3 more, so that is probably what happened. The 32-bit code makes this a bit clearer: not only did they align the int, they also aligned the short. So they pad between Data1 and Data2 to align Data2 on a 16-bit boundary, which then leaves Data3 aligned on a 32-bit boundary, and Data4 is a byte so it can't be unaligned. Then they pad the end to align the next thing in .data.

The 64-bit code looks broken (all the offsets are zero); perhaps they want the linker to patch that one up. Linking and disassembling again:

00000000004004d6 <fun>:
  4004d6:   55                      push   %rbp
  4004d7:   48 89 e5                mov    %rsp,%rbp
  4004da:   c6 05 57 0b 20 00 01    movb   $0x1,0x200b57(%rip)        # 601038 <x>
  4004e1:   66 c7 05 50 0b 20 00    movw   $0x2,0x200b50(%rip)        # 60103a <x+0x2>
  4004e8:   02 00 
  4004ea:   c7 05 48 0b 20 00 03    movl   $0x3,0x200b48(%rip)        # 60103c <x+0x4>
  4004f1:   00 00 00 
  4004f4:   c6 05 45 0b 20 00 04    movb   $0x4,0x200b45(%rip)        # 601040 <x+0x8>
  4004fb:   b8 0c 00 00 00          mov    $0xc,%eax
  400500:   5d                      pop    %rbp
  400501:   c3                      retq

Ahh, I see, yes, that is what they were doing. And they did align both Data2 and Data3. I guess I should have made it generate the address of the whole struct...

struct
{
 char Data1;
 short Data2;
 int Data3;
 char Data4;
} x;

unsigned fun ( void )
{
    unsigned long long z;
    z=(unsigned long long)&x;
    x.Data1=1;
    x.Data2=2;
    x.Data3=3;
    x.Data4=4;
    return(sizeof(x));
}

int main ( void )
{
    fun();
}

producing

00000000004004d6 <fun>:
  4004d6:   55                      push   %rbp
  4004d7:   48 89 e5                mov    %rsp,%rbp
  4004da:   48 c7 45 f8 38 10 60    movq   $0x601038,-0x8(%rbp)
  4004e1:   00 
  4004e2:   c6 05 4f 0b 20 00 01    movb   $0x1,0x200b4f(%rip)        # 601038 <x>
  4004e9:   66 c7 05 48 0b 20 00    movw   $0x2,0x200b48(%rip)        # 60103a <x+0x2>
  4004f0:   02 00 
  4004f2:   c7 05 40 0b 20 00 03    movl   $0x3,0x200b40(%rip)        # 60103c <x+0x4>
  4004f9:   00 00 00 
  4004fc:   c6 05 3d 0b 20 00 04    movb   $0x4,0x200b3d(%rip)        # 601040 <x+0x8>
  400503:   b8 0c 00 00 00          mov    $0xc,%eax
  400508:   5d                      pop    %rbp
  400509:   c3                      retq

confirming the base address 0x601038.

The struct is not tied to the instruction set. Change to this

struct
{
 char Data1;
 short Data2;
 int Data3;
 char Data4;
} __attribute__((packed)) x;

unsigned fun ( void )
{
    unsigned long long z;
    z=(unsigned long long)&x;
    x.Data1=1;
    x.Data2=2;
    x.Data3=3;
    x.Data4=4;
    return(sizeof(x));
}

int main ( void )
{
    fun();
}

and we get this

00000000004004d6 <fun>:
  4004d6:   55                      push   %rbp
  4004d7:   48 89 e5                mov    %rsp,%rbp
  4004da:   48 c7 45 f8 38 10 60    movq   $0x601038,-0x8(%rbp)
  4004e1:   00 
  4004e2:   c6 05 4f 0b 20 00 01    movb   $0x1,0x200b4f(%rip)        # 601038 <x>
  4004e9:   66 c7 05 47 0b 20 00    movw   $0x2,0x200b47(%rip)        # 601039 <x+0x1>
  4004f0:   02 00 
  4004f2:   c7 05 3f 0b 20 00 03    movl   $0x3,0x200b3f(%rip)        # 60103b <x+0x3>
  4004f9:   00 00 00 
  4004fc:   c6 05 3c 0b 20 00 04    movb   $0x4,0x200b3c(%rip)        # 60103f <x+0x7>
  400503:   b8 08 00 00 00          mov    $0x8,%eax
  400508:   5d                      pop    %rbp
  400509:   c3                      retq

the size of the struct is now 8 bytes, and they generated unaligned accesses.

I pretty much never disassemble nor examine x86, I normally stick to arm, so the very long mov instructions were not what I was expecting. I intentionally didnt optimize as all of this is dead code and would go away, and needed to see them actually generate code to talk to the struct. so perhaps other instructions would have been chosen if this were optimized and not dead code. — old_timer, Feb 13 '17 at 18:28
Not only is the 32 bit access (movl) aligned on a 32 bit boundary (when not packed), but the 16 bit access (movw) is also aligned on at least a 16 bit boundary. If you changed the last element to a short instead of a char, the struct should stay the same size (12), and the movw that replaces the movb would be aligned as well (on a 16 bit boundary). — old_timer, Feb 13 '17 at 18:37
Thanks for your answer, but I am confused: what do you mean by "they are trying to align accesses 16 bit on 16 bit boundary and 32 on 32"? Do you mean that the compiler tries to make the total size of the structure a multiple of 4 bytes (with gcc -m32) or 8 bytes (with gcc -m64)? Regards — , Feb 13 '17 at 19:26
They turn the 8 byte structure into 12. Offset 0 is Data1, offset 1 is skipped to pad for Data2, offset 2 is Data2, offset 3 is part of Data2, offset 4 is Data3. The byte at offset 1 appears to be there to align Data2. — old_timer, Feb 13 '17 at 20:48
-m32 and -m64 are irrelevant to how the structure was formed — old_timer, Feb 13 '17 at 20:48
where -m32 and -m64 MIGHT matter is the definition of char, short and int. In the 16 bit days an int was 16 and then when the 32 bit x86 processor compilers came out an int was 32, but long was 32 for both. Now I think for the 64 bit compilers (gcc) an int is still 32 bits but a long is now 64, so if you did a long instead of a int in this struct then -m32 vs -m64 might matter but that would only be because the definition changed. Thus the reason for the uint32_t stuff to avoid this. — old_timer, Feb 13 '17 at 20:50
-m32 vs -m64 have nothing to do with your structure or size or offsets, not relevant. As my answer demonstrates, just look at it. — old_timer, Feb 13 '17 at 20:51
just a little remark, you say that -m32 and -m64 produce the same structure, this is not exactly right, with -m32, sizeof(X) = 12 and with -m64, sizeof(X) = 16, so the padding is not the same — , Feb 14 '17 at 11:04
depends on the compiler -m64 I got mov $0xc,%eax and with -m32 mov $0xc,%eax, same 12 byte structure in both cases. the -m64 was shown in the code in the answer. — old_timer, Feb 14 '17 at 19:20
gcc --version gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609 — old_timer, Feb 14 '17 at 19:20
