I'm just working through Nick Desaulniers' "Let's Write Some X86-64". File h4.s:

.text
.globl main
main:
#   subq $8, %rsp       # uncomment to realign RSP to 16 bytes before the call
    movq $0, %rdi       # exit status 0
    call _exit

He's running on a Mac and says that running the above gives a segmentation fault. I'm running openSUSE 13.1 and just calling

gcc h4.s

to compile and link. I don't get a seg fault either when the stack pointer is adjusted or when that line is commented out. Does anyone know why not? Doesn't the SP need to be aligned to a 16-byte boundary?

Rich Oliver
  • The Linux kernel does not require nor impose alignment of the stack pointer to a 128-bit (giant)word boundary. –  Aug 12 '14 at 07:15
  • Hi Rich, Sami has the correct answer above. – Nick Desaulniers Aug 19 '14 at 00:47
  • The kernel doesn't use the user-space stack pointer so that's completely irrelevant. The real answer is that the `_exit` wrapper function happens not to do any `movaps` to/from the stack, just `mov $60, %eax` / `syscall`. Unlike for example scanf: [glibc scanf Segmentation faults when called from a function that doesn't align RSP](https://stackoverflow.com/q/51070716) – Peter Cordes Aug 09 '20 at 15:16

1 Answer

The x86-64 System V ABI requires RSP to be 16-byte aligned immediately before a call instruction. The call then pushes an 8-byte return address, so on entry to a function RSP is 8 mod 16 (that is, RSP + 8 is a multiple of 16).
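The alignment bookkeeping can be followed line by line in a small program. This is only a sketch, assuming Linux with gcc and glibc as in the question; the file name `align.s` and the label `msg` are made up for illustration.

```asm
# align.s (hypothetical file name); build with: gcc align.s
.text
.globl main
main:                        # entry: RSP is 8 mod 16 (return address just pushed)
    pushq %rbx               # one 8-byte push: RSP is now 0 mod 16
    leaq  msg(%rip), %rdi    # 1st integer/pointer argument goes in RDI
    call  puts@PLT           # RSP is 16-byte aligned at the call, as the ABI asks
    popq  %rbx               # undo the push: RSP is 8 mod 16 again
    xorl  %eax, %eax         # return 0 from main
    ret

.section .rodata
msg:
    .asciz "stack is aligned"
```

An odd number of 8-byte pushes (or `subq $8, %rsp`) is the usual way to restore 16-byte alignment in a function that makes calls.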

In practice, only SSE instructions that require alignment (like movaps / movdqa) fault on a misaligned stack, and many library functions happen not to use them on stack memory. That is especially true of simple ones: _exit could be as little as mov $231, %eax / syscall.

However, if the stack is misaligned and some callee relies on the assumption that it has a 16B-aligned stack, for example by issuing an aligned xmm instruction like "movdqa [rsp],...", then you can get an actual seg fault. Hypothetically, some other assumption about stack alignment could produce some other kind of error.
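A minimal sketch of that kind of fault, assuming Linux with gcc as in the question and a C runtime that calls main with the ABI-mandated alignment (the file name `misaligned.s` is made up). It does the aligned store itself instead of relying on a library function to do one:

```asm
# misaligned.s (hypothetical file name); build with: gcc misaligned.s
.text
.globl main
main:                        # entry: RSP is 8 mod 16
    subq   $16, %rsp         # reserve 16 bytes; RSP is still 8 mod 16
    movaps %xmm0, (%rsp)     # alignment-checked store to a misaligned address:
                             # expected to raise #GP, which Linux reports as SIGSEGV
    addq   $16, %rsp         # only reached if the store did not fault
    xorl   %eax, %eax        # return 0 from main
    ret
```

Changing both `$16` constants to `$24` makes RSP a multiple of 16 at the store, and the program should then exit normally. Replacing movaps with movups (the unaligned form) should also avoid the fault without touching RSP.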

In summary: simply misaligning the stack before a call will usually not fault.

Just like C undefined behaviour, it's not required to fail if you violate the rules, but it can fail. And what happens to work now might break in the future or on other systems.

Compilers are allowed to use SSE anywhere to copy 16 bytes at a time to / from local variables, because of the ABI guarantee, and because x86-64 guarantees at least SSE2.

For example, see "glibc scanf Segmentation faults when called from a function that doesn't align RSP" (https://stackoverflow.com/q/51070716): modern builds of glibc include a movaps to copy 16 bytes at once to a local struct or array. Older builds of glibc didn't require stack alignment (when you properly set AL=0).
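As a rough sketch of a variadic call that satisfies both requirements (aligned RSP and AL=0), here is printf rather than scanf, since it follows the same variadic calling convention and needs no input. It assumes Linux with gcc and glibc; the file name `variadic.s` and the label `fmt` are made up for illustration.

```asm
# variadic.s (hypothetical file name); build with: gcc variadic.s
.text
.globl main
main:                        # entry: RSP is 8 mod 16
    subq  $8, %rsp           # RSP is now 0 mod 16, ready for a call
    leaq  fmt(%rip), %rdi    # 1st argument: format string
    movl  $42, %esi          # 2nd argument: the int consumed by %d
    xorl  %eax, %eax         # AL = 0: no vector registers hold variadic arguments
    call  printf@PLT
    addq  $8, %rsp           # restore RSP before returning
    xorl  %eax, %eax         # return 0 from main
    ret

.section .rodata
fmt:
    .asciz "%d\n"
```

Dropping the `subq` / `addq` pair leaves RSP misaligned at the call. Whether that actually faults depends on the glibc build, which is exactly the "happens to work" situation described above.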

Peter Cordes
drivingon9