The x86-64 System V ABI requires the stack pointer to be 16-byte aligned immediately before a call, so on entry to a routine RSP is 8 mod 16 (16n + 8), because call has pushed an 8-byte return address on the stack.
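The arithmetic behind that rule can be sketched in plain C. This is only a model of the address math, not real stack manipulation; the function names and the example addresses used with them are hypothetical, chosen for illustration.

```c
#include <stdint.h>

/* Model of what `call` does to RSP: it pushes an 8-byte return address. */
uint64_t rsp_after_call(uint64_t rsp_before_call) {
    return rsp_before_call - 8;
}

/* RSP modulo 16: 0 right before a call, 8 on entry to the callee. */
unsigned rsp_mod16(uint64_t rsp) {
    return (unsigned)(rsp % 16);
}

/* Extra bytes a callee must subtract from RSP so that RSP is 16-byte
   aligned again before it makes a call of its own. `pushed_bytes` is how
   much the callee has already pushed or reserved since entry. */
unsigned padding_before_next_call(uint64_t rsp_on_entry, unsigned pushed_bytes) {
    uint64_t rsp_now = rsp_on_entry - pushed_bytes;
    return (unsigned)(rsp_now % 16);
}
```

With a made-up entry value of 0x7fffffffe008, a function that has pushed nothing needs 8 more bytes (for example a dummy push or sub $8, %rsp) before calling anything, while a function that has already pushed one 8-byte register needs none.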
Only SSE instructions with an alignment-required memory operand (like movaps / movdqa) actually depend on this alignment, and most library functions happen not to use them on the stack. Especially simple ones: _exit could be as simple as mov $231, %eax / syscall.
However, if the stack is misaligned and some code at some point relies on the assumption that it has a 16B-aligned stack, for example by issuing an aligned xmm instruction like "movdqa [rsp],...", then you can get an actual segfault. Hypothetically, a different kind of assumption about stack alignment could fail in some other way.
In summary: simply misaligning the stack before a call will usually not fault.
Just like C undefined behaviour, it's not required to fail if you violate the rules, but it can fail. And what happens to work now might break in the future or on other systems.
Compilers are allowed to use SSE anywhere to copy 16 bytes at a time to / from local variables, because of the ABI guarantee, and because x86-64 guarantees at least SSE2.
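As a rough illustration of the kind of C code this applies to, here is a minimal sketch. The struct and function names are hypothetical. Whether a given compiler actually emits movaps / movdqa for the copy depends on the compiler, its version, and the optimization level, so treat the comments as describing what is permitted rather than what is guaranteed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte struct, used only for illustration. */
struct pair {
    uint64_t lo;
    uint64_t hi;
};

/* Returns 1 if p is 16-byte aligned, i.e. usable as an aligned SSE operand. */
int is_aligned16(const void *p) {
    return ((uintptr_t)p % 16) == 0;
}

/* Copies 16 bytes into a local. On x86-64 a compiler may implement this as
   one 16-byte load plus one aligned 16-byte store to the stack, with the
   local's address computed as a fixed offset from RSP. */
uint64_t sum_via_aligned_local(const struct pair *src) {
    _Alignas(16) struct pair local;
    memcpy(&local, src, sizeof local);
    return local.lo + local.hi;
}

/* Reports whether a 16-byte-aligned local really sits on a 16-byte boundary.
   This holds when every caller up the chain kept RSP aligned as the ABI
   requires; the compiler does not re-check it at run time. */
int local_is_aligned16(void) {
    _Alignas(16) unsigned char buf[16];
    memset(buf, 0, sizeof buf);
    return is_aligned16(buf);
}
```

If hand-written asm calls such a function with RSP off by 8, the fixed offset lands on an address that is 8 mod 16, and an aligned store to it would fault.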
For example, see "glibc scanf Segmentation faults when called from a function that doesn't align RSP": modern builds of glibc include a movaps to copy 16 bytes at once to a local struct or array. Older builds of glibc happened not to depend on stack alignment there (as long as you properly set AL=0 for the variadic call).