Here is a declaration of a C struct:

struct align
{
    char c;  // 1 byte
    short s; // 2 bytes
};

In my environment, sizeof(struct align) is 4, and the 1 byte of padding sits between 'char c' and 'short s'. Some say that's because 'short' has to be 2-byte aligned, so the 1 byte of padding comes after 'char c'. On a 32-bit machine, I know 'int' had better be 4-byte aligned to avoid 2 memory read cycles, since the addresses sent on the address bus between the CPU and memory are multiples of 4. But 'short' is 2 bytes, which is less than 4 bytes, so its address could be any byte within a 4-byte unit (except the last byte).

multiple of 4 address -> |0|1|2|3|

I mean, 'short' can start at 0, 1, or 2. All of these can be retrieved in 1 read cycle; it doesn't have to be 0 or 2. In my 'struct align' case, 'char c' could be at 0, 'short s' could be at 1-2, and the padding could be at 3.

Why does a 2-byte "short" have to be 2-byte aligned?

Thanks

Update: my environment is gcc version 4.4.7, i686, Intel.

password636
  • What compiler and platform are you using? What compiler options do you have set? – Dai Apr 12 '14 at 06:53
  • It depends on your compiler and system architecture. You should include those details in your question. On some systems, `short` must have 2-byte alignment. – M.M Apr 12 '14 at 07:05
  • On many machines, accessing an N-byte quantity (at least for N in {1, 2, 4, 8, 16}) works most efficiently when the quantity is N-byte aligned. It's the way life is; get used to it, because I doubt that chip manufacturers are going to change it just because you think it isn't the way it should be. – Jonathan Leffler Apr 12 '14 at 07:14
  • @Dai, I've updated the question with my environment. No specific compile options, just "gcc". – password636 Apr 12 '14 at 08:14
  • @MattMcNabb, I've updated the question with my environment. My system is a normal 32-bit CentOS with an Intel chip. – password636 Apr 12 '14 at 08:17

1 Answer

That is because, from the machine's perspective, a member of a struct is no different from a standalone variable of the same type. Whatever alignment you choose, it applies to both.

For example, if short is two bytes long,

struct align
{
    char c;
    short s; // two-byte word
};

short ss; // two-byte word

The member s has a 2-byte type (e.g. WORD in IA-32), exactly the same type as the "standalone" variable ss. The underlying architecture treats them the same way. So when it comes to an alignment requirement for that type, it applies to both.
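As a rough way to see this on a given compiler, the sketch below asks the compiler for the offset of s and for the alignment it requires of any short. It assumes a C11 compiler (for `_Alignof`); the helper names `member_offset`, `short_alignment` and `check_layout` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */

struct align {
    char c;
    short s;
};

/* Offset of s inside the struct, as chosen by the compiler. */
size_t member_offset(void) { return offsetof(struct align, s); }

/* Alignment the compiler requires for every short, member or standalone. */
size_t short_alignment(void) { return _Alignof(short); }

/* The member obeys the same rule as a standalone short, and the struct
   size is a multiple of the struct's own alignment so arrays stay aligned. */
void check_layout(void) {
    assert(member_offset() % short_alignment() == 0);
    assert(sizeof(struct align) % _Alignof(struct align) == 0);
}
```

Calling `check_layout()` from `main` should pass silently. The exact numbers are implementation-defined, so the sketch only checks the relationships between them rather than specific values.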

And if you add the padding at the end of the data instead, the short may still be misaligned. Consider the case where the short starts at the last byte of a 4-byte unit: it would then straddle the boundary.
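The boundary case can be worked through with plain address arithmetic. The following is only a simulation under the question's assumption that memory is fetched in 4-byte units; it is not a statement about how any particular CPU behaves. All names (`crosses_word`, `count_crossings`, the two offset constants) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a 2-byte object starting at addr spans two 4-byte units. */
bool crosses_word(uintptr_t addr) {
    return (addr / 4) != ((addr + 1) / 4);
}

/* Layout proposed in the question (s at +1) versus the compiler's (s at +2). */
enum { PACKED_S_OFFSET = 1, ALIGNED_S_OFFSET = 2 };

/* Count struct base addresses in [0, limit), stepping by base_align,
   at which the short member would straddle a 4-byte unit. */
unsigned count_crossings(unsigned base_align, unsigned s_offset, unsigned limit) {
    unsigned n = 0;
    for (uintptr_t base = 0; base < limit; base += base_align)
        if (crosses_word(base + s_offset))
            n++;
    return n;
}

void check_layouts(void) {
    /* s at +1, struct allowed at any address: bases 2, 6, 10, 14 straddle. */
    assert(count_crossings(1, PACKED_S_OFFSET, 16) == 4);
    /* s at +2, struct 2-byte aligned: s always sits at an even address. */
    assert(count_crossings(2, ALIGNED_S_OFFSET, 16) == 0);
    /* s at +1 is safe only if the whole struct is forced to 4-byte alignment. */
    assert(count_crossings(4, PACKED_S_OFFSET, 16) == 0);
}
```

Under these assumptions, placing s at offset 1 avoids straddling only when the struct itself is pinned to a 4-byte boundary, which is a stricter requirement than any of its members has. Aligning s to 2 bytes gives the same guarantee while the struct needs only 2-byte alignment.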

Eric Z
  • Yes, I think your explanation is the reason. From the "standalone" variable's point of view, 2-byte alignment guarantees that a "short" never crosses two 4-byte units, so the CPU can always get a "short" in 1 read cycle on a 32-bit machine. – password636 Apr 12 '14 at 08:17
  • @password636 but if the char c is aligned, it guarantees that the short won't be across the boundary. Why is it still required? – Bharel Feb 04 '19 at 13:59