Why does the structure size differ between 32-bit and 64-bit programs?

Question

The following is a simple C program:

#include <stdio.h>

typedef struct
{
    char a;
    double b;
} A;
int main(void) {
    printf("sizeof(A) is %zu bytes\n", sizeof(A)); /* sizeof yields size_t, so use %zu rather than %d */
    return 0;
}

When I compile it as a 32-bit program, the output is:

sizeof(A) is 12 bytes

I know the structure's memory layout should be:

 ____________________________
|a|3 padding| b              |
 ____________________________

But when I compile it as a 64-bit program, the output is:

sizeof(A) is 16 bytes

So the structure's memory layout should be:

 ____________________________________
|a|7 padding        | b              |
 ____________________________________

Personally, I would expect the structure to be 16 bytes whether the program is 32-bit or 64-bit (char is 1 byte long, and double has an alignment of 8 bytes). Why is the size 12 bytes in the 32-bit program?

Compiler can put any number of padding bytes after structure members as per C specs. — Mohit Jain, Jan 08 '15 at 06:33
On a 32-bit platform, the maximal alignment is 4 bytes. There's simply no 8-byte aligned type. 8-byte values need to be stored as two words in memory, or in two registers in the case of 64-bit integer types. Hence there's no need (nor any performance improvement in) aligning data to 8-byte boundaries. — The Paramagnetic Croissant, Jan 08 '15 at 06:33
@TheParamagneticCroissant: wrong. See http://stackoverflow.com/questions/11108328/double-alignment. Windows aligns doubles on 8 byte boundaries, even on 32 bit machines. — eckes, Jan 08 '15 at 06:57
@TheParamagneticCroissant: on a 32-bit Intel platform, you're correct. On almost any other platform (SPARC, PowerPC, …), you are incorrect and an 8-byte type like `double` needs to be aligned on an 8-byte boundary. We can infer from the question that the OP is in fact using Intel. — Jonathan Leffler, Jan 08 '15 at 07:18
@JonathanLeffler yeah, right. I wish I could edit my comment to amend it. — The Paramagnetic Croissant, Jan 08 '15 at 07:25

score 2 · Accepted Answer · edited Jan 25 '20 at 11:56

After delving into this question, I want to answer the question myself.

My OS is Solaris, and this issue occurs on x86 (Jonathan Leffler's comment is right). When I test it on SPARC, both the 32-bit and the 64-bit program print "sizeof(A) is 16 bytes".

I think the reasons are:

On x86, accessing misaligned data does not crash the program; it only affects performance. In a 32-bit program the general-purpose registers are 4 bytes wide, and the System V ABI for 32-bit x86 only requires a double inside a structure to be 4-byte aligned, so 3 padding bytes are enough. In a 64-bit program the registers are 8 bytes wide and the ABI aligns double on 8 bytes, so an 8-byte double can always be fetched with a single aligned access, which improves performance.
On SPARC, data must be strictly aligned; a misaligned access causes a "Bus error". So there are always 7 padding bytes before the double to place it on an 8-byte boundary.

In short, this depends on the CPU, as user694733 answered.

score 2 · Answer 2 · answered Jan 25 '20 at 10:46

On Intel CPUs, on both 32-bit and 64-bit machines, the SIMD floating-point instructions read or write either 16 bytes (two doubles) or 8 bytes (a single double) in a single machine instruction. Those are the most common instructions for handling floating-point values. It is all a matter of speed:

Reading a single data item may be done either with the aligned read instruction or with the unaligned read instruction. The aligned version is guaranteed to be at least as fast. The unaligned instruction has to deal with complicated cases where the data is split between two cache lines, or even between two different memory pages. Furthermore, the CPU is optimized for certain instructions, namely the aligned ones, to the point that reading a single byte can take more time than reading 16 aligned bytes. The archaic 1-byte instructions of the 8088 (MOV AL, MOV AH, etc.) are not hardware optimized.

The compiler writer has to choose between dense code and fast code. In the old days, when my PC had 16 KB of memory, memory was scarce. Later, one could instruct the compiler exactly how to align structure members. When 64-bit CPUs came out, memory was cheap enough that each structure member is aligned on its natural boundary according to its type (even addresses for short, multiples of 4 for int and float, multiples of 8 for __int64 and double, multiples of 16 for __m128, multiples of 32 for __m256), and the structure size is rounded up to a multiple of its strictest member alignment.

score 1 · Answer 3 · answered Jan 08 '15 at 06:34

It is implementation defined.

In the end it depends on what limits the CPU instructions have when accessing memory. Compilers generally try to pick the layout that is most efficient in speed first, and in memory usage second.

user694733

It would be reasonable for a 32/64 bit compiler to favor memory use more when in 32 bits mode. After all, there's probably a reason the programmer chose to compile in 32 bits mode. – MSalters Jan 08 '15 at 07:48
