Are there performance issues when using pragma pack(1)?
Absolutely. In January 2020, Microsoft's Raymond Chen posted concrete examples of how #pragma pack(1) can produce bloated executables that need many more instructions to operate on packed structures, especially on non-x86 hardware that does not directly support misaligned accesses.
Anybody who writes #pragma pack(1) may as well just wear a sign on their forehead that says “I hate RISC”
When you use #pragma pack(1), this changes the default structure packing to byte packing, removing all padding bytes normally inserted to preserve alignment.
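As a rough illustration of what "removing all padding bytes" means, here is a minimal sketch. The struct names `Natural` and `Packed` and the two helper functions are hypothetical, not taken from the quoted article, and the sizes in the comments assume a typical platform where a 32-bit integer wants 4-byte alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example types, not from the quoted article. */
struct Natural {
    uint8_t  tag;
    uint32_t value;   /* typically preceded by 3 padding bytes */
};

#pragma pack(push, 1)
struct Packed {
    uint8_t  tag;
    uint32_t value;   /* no padding: starts at offset 1, so it is misaligned */
};
#pragma pack(pop)

/* Offset of the 32-bit member in each layout. */
size_t natural_value_offset(void) { return offsetof(struct Natural, value); }
size_t packed_value_offset(void)  { return offsetof(struct Packed, value); }
```

On such a platform `struct Natural` usually occupies 8 bytes with `value` at offset 4, while `struct Packed` occupies 5 bytes with `value` at offset 1. The `push`/`pop` form keeps the packing change from leaking into later declarations.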
...
The possibility that any P structure could be misaligned has significant consequences for code generation, because all accesses to members must handle the case that the address is not properly aligned.
void UpdateS(S* s)
{
    s->total = s->a + s->b;
}

void UpdateP(P* p)
{
    p->total = p->a + p->b;
}
Despite the structures S and P having exactly the same layout, the
code generation is different because of the alignment.
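The definitions of S and P are elided from this excerpt. As a sketch only, the following definitions are an assumption inferred from the offsets in the listing (total at offset 0, the two addends at offsets 4 and 8). They show how two structures can share a layout while differing in their alignment guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed definitions: the excerpt does not show them. Field order is
   inferred from the offsets used in the assembly listing. */
struct S {
    int32_t total;
    int32_t a;
    int32_t b;
};

#pragma pack(push, 1)
struct P {
    int32_t total;
    int32_t a;
    int32_t b;
};
#pragma pack(pop)

/* Same size and same member offsets... */
static_assert(sizeof(struct S) == sizeof(struct P), "same size");
static_assert(offsetof(struct S, a) == offsetof(struct P, a), "same offset of a");
static_assert(offsetof(struct S, b) == offsetof(struct P, b), "same offset of b");

/* ...but P only promises byte alignment, so a P may live at any address. */
static_assert(_Alignof(struct P) == 1, "packed struct is byte-aligned");
```

With these assumed definitions, the only thing the compiler knows differently about P is that a pointer to it may be misaligned, and that is what drives the two instruction sequences in the listing that follows.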
UpdateS                     UpdateP
Intel Itanium
adds r31 = r32, 4           adds r31 = r32, 4
adds r30 = r32, 8 ;;        adds r30 = r32, 8 ;;
ld4 r31 = [r31]             ld1 r29 = [r31], 1
ld4 r30 = [r30] ;;          ld1 r28 = [r30], 1 ;;
                            ld1 r27 = [r31], 1
                            ld1 r26 = [r30], 1 ;;
                            dep r29 = r27, r29, 8, 8
                            dep r28 = r26, r28, 8, 8
                            ld1 r25 = [r31], 1
                            ld1 r24 = [r30], 1 ;;
                            dep r29 = r25, r29, 16, 8
                            dep r28 = r24, r28, 16, 8
                            ld1 r27 = [r31]
                            ld1 r26 = [r30] ;;
                            dep r29 = r27, r29, 24, 8
                            dep r28 = r26, r28, 24, 8 ;;
add r31 = r30, r31 ;;       add r31 = r28, r29 ;;
st4 [r32] = r31             st1 [r32] = r31
                            adds r30 = r32, 1
                            adds r29 = r32, 2
                            extr r28 = r31, 8, 8
                            extr r27 = r31, 16, 8 ;;
                            st1 [r30] = r28
                            st1 [r29] = r27, 1
                            extr r26 = r31, 24, 8 ;;
                            st1 [r29] = r26
br.ret.sptk.many rp         br.ret.sptk.many rp
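For readers who do not read Itanium assembly: the ld1/dep and extr/st1 sequences in the UpdateP column amount to building and splitting each 32-bit value one byte at a time. A rough C equivalent might look like the sketch below. The function names are hypothetical, and little-endian byte order is assumed, matching the bit positions used by the dep instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a 32-bit little-endian value one byte at a time, roughly what
   the ld1/dep sequence does for each misaligned load. */
uint32_t load_u32_bytewise(const unsigned char *p)
{
    uint32_t v = p[0];
    v |= (uint32_t)p[1] << 8;
    v |= (uint32_t)p[2] << 16;
    v |= (uint32_t)p[3] << 24;
    return v;
}

/* Split a 32-bit value into four byte stores, roughly what the extr/st1
   sequence does for the misaligned store. */
void store_u32_bytewise(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v);
    p[1] = (unsigned char)(v >> 8);
    p[2] = (unsigned char)(v >> 16);
    p[3] = (unsigned char)(v >> 24);
}
```

Each aligned ld4 or st4 in the UpdateS column is replaced by four byte accesses plus the bit manipulation needed to combine or split them, which is where the extra instructions come from.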
...
[examples from other hardware]
...
Observe that for some RISC processors, the code size explosion is quite significant. This may in turn affect inlining decisions.
Moral of the story: Don’t apply #pragma pack(1) to structures unless absolutely necessary. It bloats your code and inhibits optimizations.
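One common way to avoid packing the in-memory structure, when the packed layout is only needed for a file or wire format, is to keep the struct naturally aligned and copy fields in and out of a byte buffer. The sketch below is not from the quoted article. The `Record` type, the 5-byte wire layout, and both function names are hypothetical, and host byte order is assumed for simplicity.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory record with natural alignment. */
struct Record {
    uint8_t  tag;
    uint32_t value;
};

/* Hypothetical wire format: 1 tag byte followed by a 4-byte value in host
   byte order, with no padding (5 bytes total). */
enum { RECORD_WIRE_SIZE = 5 };

/* Read a record from an arbitrarily aligned byte buffer. */
struct Record record_decode(const unsigned char *wire)
{
    struct Record r;
    r.tag = wire[0];
    memcpy(&r.value, wire + 1, sizeof r.value);  /* valid for any alignment */
    return r;
}

/* Write a record into an arbitrarily aligned byte buffer. */
void record_encode(unsigned char *wire, const struct Record *r)
{
    wire[0] = r->tag;
    memcpy(wire + 1, &r->value, sizeof r->value);
}
```

With this approach the misaligned access is confined to the encode and decode functions, and every other use of `struct Record` can rely on aligned loads and stores.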
#pragma pack(1) and its variations are also subtly dangerous - even on x86 systems where they supposedly "work".