What is the difference between structure padding in C on 32-bit and 64-bit architectures?

Question

I read at http://fresh2refresh.com/c-programming/c-structure-padding/ that memory is arranged in groups of 4 bytes on a 32-bit processor and 8 bytes on a 64-bit processor, but the page didn't clarify the difference between the two.

struct structure2
{
    int id1;
    char name;
    int id2;
    char c;
    float percentage;
};

The C standard does not enforce any kind of alignment. It solely depends on the ABI of the platform. This is not just a matter of the CPU, e.g. on some 32 bit architectures, 64 bit alignment can be used, e.g. for cache/bus/RAM efficiency. It is not just a matter of the register-width of the CPU. In other words: your question is too broad. — too honest for this site, Sep 16 '16 at 09:10
Possible duplicate of [Structure padding and packing](http://stackoverflow.com/questions/4306186/structure-padding-and-packing) — msc, Sep 16 '16 at 09:10
It depends on your compiler, on your compiler settings and on the target platform. — Jabberwocky, Sep 16 '16 at 09:37
@Olaf whether it always use 32 bit architecture on 64 bit, when does it uses 8 bytes rather than 4 bytes — Rushikesh Gaidhani, Sep 16 '16 at 09:54
@RUSHIKESHGAIDHANI: Your comment is not clear. What do you mean? — too honest for this site, Sep 16 '16 at 10:20
1 byte = 8 bits. 32/8=4 but 64/8=8. What part of that isn't clear? — Lundin, Sep 16 '16 at 13:27

Raman · Accepted Answer · 2018-05-18T16:14:04.213

A "32-bit processor" here means that 32 bits (4 bytes) of data are read and processed at a time. (More precisely, this refers to the width of the data bus rather than the size of the registers.)

Now consider an int:

int a = 10; // assuming a 4-byte int

00000000 00000000 00000000 00001010

Assuming a little-endian architecture, it would be stored as:

------------------------------------------------------------------------
| 00001010  |  00000000  |  00000000   |  00000000   |  <something_else>   
-------------------------------------------------------------------------
  1st byte     2nd byte     3rd byte      4th byte
\--------------------------------------------------/
                         |
               4 bytes processed together

In this case, when the processor reads the data, it can fetch the entire integer in one go (all 4 bytes together; strictly speaking, in one machine cycle).

However, consider a case where the same integer is stored as:

-----------------------------------------------------------------------
| <something_else> | 00001010  |  00000000  |  00000000  |  00000000  |
-----------------------------------------------------------------------
      1st byte       2nd byte     3rd byte     4th byte     5th byte
\--------------------------------------------------------/
                            |
               4 bytes processed together

In this case, the processor would need two machine cycles to read the integer, because the integer straddles two 4-byte groups.

Most architectures try to minimize CPU cycles, so many compilers prefer the first arrangement and enforce alignment requirements by inserting padding. Typically, 4-byte ints are stored at addresses that are multiples of 4, chars at any address (multiples of 1), 8-byte doubles at multiples of 8, 8-byte long long ints at multiples of 8, and so on.

Now consider your structure

struct structure2
{
    int id1;            // assuming a 4-byte int
    char name;          // 1 byte
    int id2;            // 4 bytes
    char c;             // 1 byte
    float percentage;   // assuming a 4-byte float
};

id1 is stored at some address in memory (a multiple of 4) and takes 4 bytes.

name takes the next byte.

If id2 were stored in the very next byte, it would break the alignment rule above. So the compiler leaves 3 bytes of padding and stores id2 at the next address that is a multiple of 4, where it takes 4 bytes.

For c, the same thing happens as for name: it takes the next byte and is followed by 3 bytes of padding.

Finally, percentage is stored in the next 4 bytes.

So the total size of the structure is 20 bytes.

A more complicated case would be, say:

struct mystructure
{
    char   a;   // 1-byte char
    double b;   // 8-byte double
    int    c;   // 4-byte int
};

At first glance one might say the size would be 20 bytes (1 byte for the char + 7 bytes of padding + 8 bytes for the double + 4 bytes for the int).

However, the actual size would be 24 bytes.

Say somebody declares an array of this structure:

struct mystructure arr[4];

Here (assuming a 20-byte structure), arr[0] is properly aligned, but if you check carefully you'll find that arr[1].b is misaligned: it would start 20 + 8 = 28 bytes into the array, which is not a multiple of 8. So 4 bytes of extra padding are added at the end of the structure to make its size a multiple of its alignment. (Every structure also has its own alignment requirement.)

Hence the total size would be 24 bytes.

The sizes of int, long, etc. are decided by the compiler. The compiler generally takes the processor architecture into account, but it may choose not to.

Similarly, whether to use padding is decided by the compiler. Omitting padding is known as packing. Some compilers have explicit options to enable packing.

In GCC (the GNU C compiler) you can do it with __attribute__((__packed__)), so in the following code:

struct __attribute__((__packed__)) mystructure2
{
    char a;
    int b;
    char c;
};

mystructure2 has a size of 6 bytes because of the explicit request to pack the structure. Accessing its members may be slower, since b is no longer aligned.

By now you can probably work out for yourself what would happen on a 64-bit processor, or if the size of int were different.

This is true for every commonly used ABI on real systems (AFAIK), but the C standard allows an ABI where struct members are always packed with no padding (instead of being naturally aligned within the struct). This design choice is motivated by performance, not by any requirement in the C standard, so it's good that you started out showing how a misaligned load can be slower. (Modern x86 CPUs have fast hardware support for unaligned loads/stores, and no penalty as long as they don't cross a cache-line boundary.) Also, 1 cycle/access is a (justifiable) simplification of throughput and latency. — Peter Cordes, Sep 16 '16 at 15:42
Also worth mentioning that most real compilers have a way to specify packed structs, with no padding, e.g. in GNU C: `struct __attribute__((packed)) foo { uint16_t u16; uint32_t u32; uint8_t u8; };` will be 7 bytes, with no trailing padding. (Check it out on the [Godbolt compiler explorer](https://godbolt.org/g/p92mGV), where I put an example to play around with, using sizeof() and offsetof() on one of the members of the same struct with/without the `packed` attribute.) — Peter Cordes, Sep 16 '16 at 15:53

score 0 · Answer 2 · answered Sep 16 '16 at 09:12

The website does not specify exactly which kind of 64-bit platform is used, but it seems to assume an ILP64 platform (int, long and pointers are all 64-bit) with length-aligned integers. That means an int is four bytes on a 32-bit processor and eight bytes on a 64-bit processor, and each must be aligned on a multiple of its own length.

The result is a change in the length of the padding between name and id2 (padding necessary to preserve id2's alignment).

On a 32-bit platform, there would be three bytes of padding; on a 64-bit platform, there would be seven.

The padding between c and percentage will likely not change, because the size of floating-point variables is not affected by the processor's bitness.
