Padding logic of 'double' struct members on 32-bit machines

Question

According to https://www.geeksforgeeks.org/structure-member-alignment-padding-and-data-packing/, on a 32-bit machine where the data bus is 4 bytes wide, 'double' struct members start at addresses that are multiples of 8. But even if they started at addresses that are multiples of 4, we would still need only 2 loads to bring them in from memory. So I don't understand why the starting address is held to the stricter constraint of being a multiple of 8.
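One way to see what a particular compiler and ABI actually do is to ask for the offsets directly. The sketch below is only an illustration: struct sample and both helper functions are hypothetical names, and the values they return depend on the target. As far as I know, GCC targeting 32-bit x86 Linux without -malign-double places the double at offset 4, while many other ABIs place it at offset 8.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical struct for illustration: a char followed by a double. */
struct sample {
    char   tag;
    double value;
};

/* Offset the compiler chose for the double member (ABI-dependent). */
size_t double_offset(void) {
    return offsetof(struct sample, value);
}

/* Alignment of the whole struct. Because the only other member is a
   char, this equals the alignment applied to the double member. */
size_t struct_alignment(void) {
    size_t a = _Alignof(struct sample);
    assert(a > 0);
    return a;
}
```

Printing double_offset() and sizeof(struct sample) from a small main on the machine in question shows which rule that ABI applies.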

score 2 · Accepted Answer · answered Aug 29 '21 at 18:56

I am no expert, so if I'm wrong I'd love to learn more too, but one reason I've seen for forcing doubles onto 8-byte alignment is the CPU cache. If doubles were placed on 4-byte boundaries, a double could straddle two cache lines, so one line would hold only half of it and reading it would take extra accesses. Forcing 8-byte alignment ensures that the whole double sits within a single cache line.
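The straddling case can be checked with a little arithmetic. This is a rough sketch, not a model of any particular CPU: the 64-byte line size is an assumption (common, but not universal), and straddles_cache_line is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed cache-line size; 64 bytes is common but not universal. */
#define CACHE_LINE_SIZE 64u

/* True if an object of `size` bytes starting at byte address `addr`
   occupies more than one cache line. */
bool straddles_cache_line(size_t addr, size_t size) {
    assert(size > 0);
    size_t first_line = addr / CACHE_LINE_SIZE;
    size_t last_line  = (addr + size - 1) / CACHE_LINE_SIZE;
    return first_line != last_line;
}
```

Under that assumption, a double at address 60 (4-byte aligned) covers bytes 60 to 67 and spans two lines, whereas a double at any multiple of 8 never does, because 8 divides 64.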

A similar question, "Why is data structure alignment important for performance?", has answers that may explain this better than I can.

score 1 · Answer 2 · answered Aug 29 '21 at 19:00

In the model the linked page presents, there is no reason to restrict the address of a double to a multiple of eight bytes. It gives the number of four-byte memory transfers as a reason for alignment, and eight bytes can be loaded in two transfers as long as they start on a four-byte-aligned address. There is no need for an eight-byte-aligned address. (It should come as no surprise that some web page on the Internet is not of high quality.)
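The transfer count in that simple model can be written down as a small function. It is a sketch of the model only (memory is read in aligned chunks of bus_width bytes), and transfers_needed is a hypothetical name, not anything from the linked page.

```c
#include <assert.h>
#include <stddef.h>

/* Number of aligned bus transfers needed to read `size` bytes starting
   at byte address `addr`, when memory is read in aligned chunks of
   `bus_width` bytes. */
size_t transfers_needed(size_t addr, size_t size, size_t bus_width) {
    assert(size > 0 && bus_width > 0);
    size_t first_chunk = addr / bus_width;
    size_t last_chunk  = (addr + size - 1) / bus_width;
    return last_chunk - first_chunk + 1;
}
```

With a four-byte bus, an eight-byte object at address 4 and one at address 8 both take two transfers; only an address that is not a multiple of 4, such as 2, pushes the count to three.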

However, there is no single definition of a “32-bit machine” or a “64-bit machine”. Processors and systems vary in several regards, including bus width (and hence basic memory transfer size), processor register width, virtual memory mapping features, and instruction set. No single one of these makes a machine “32 bit” or “64 bit.”

A processor might require eight-byte-aligned addresses for a double simply because its instruction set encoding is designed not to have low bits for the address of a double. The “load double” instruction that loads a double into a floating-point register might not have any way of specifying the low three bits of an address in certain addressing forms; they are always taken to be zero.
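To make that concrete, here is a purely hypothetical encoding, not modeled on any specific instruction set, in which the address field of a "load double" stores the address divided by eight. All three function names are made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding: the instruction keeps only addr >> 3, so the
   low three bits of the address are simply not representable. */
uint32_t encode_double_addr(uint32_t addr) {
    return addr >> 3;
}

/* The hardware rebuilds the address with the low three bits as zero. */
uint32_t decode_double_addr(uint32_t field) {
    return field << 3;
}

/* An address survives the round trip only if it is a multiple of 8. */
bool is_encodable(uint32_t addr) {
    return (addr & 7u) == 0;
}
```

In this scheme address 20 encodes to 2 and decodes back to 16, so a double at a merely four-byte-aligned address cannot be named by the instruction at all.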

Another possibility is that the processor is largely a 32-bit processor, with 32-bit general registers, but has a 64-bit bus. Loads of 32-bit items into general registers need only be four-byte aligned because the processor always loads some eight-byte-aligned 64 bits and then takes the high or low 32 bits. (Likely it also coalesces consecutive 32-bit load instructions when possible, so the full 64 bits are used.) A double, by contrast, fills the whole 64-bit transfer, so it can be fetched in one transfer only if it is eight-byte aligned.
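A toy model of such a load might look like the following. It assumes little-endian ordering of the two halves and represents memory as an array of 64-bit words; load32_via_64bit_bus is a hypothetical name, and real hardware does this in the memory interface, not in software.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models a 32-bit load on a machine whose bus always fetches one
   eight-byte-aligned 64-bit word. Assumes little-endian layout, so the
   lower-addressed half is the low 32 bits of the word. */
uint32_t load32_via_64bit_bus(const uint64_t *memory, size_t byte_addr) {
    assert(byte_addr % 4 == 0);              /* four-byte alignment suffices */
    uint64_t word = memory[byte_addr / 8];   /* one aligned 64-bit transfer */
    return (byte_addr % 8 == 0)
        ? (uint32_t)(word & 0xFFFFFFFFu)     /* low half */
        : (uint32_t)(word >> 32);            /* high half */
}
```

Either half comes out of a single transfer, which is why four-byte alignment is enough for 32-bit items on such a machine.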

As another answer states, requiring eight-byte alignment for eight-byte objects prevents them from straddling cache lines or memory pages.
