
When a program enters the _start routine, is the stack pointer already aligned to a 16-byte boundary, or do I have to align it manually? That is, is it aligned even before the prologue (push rbp; mov rbp, rsp) in _start?

I know that on x86-64 RSP is aligned to 8 bytes at program start, but I do not know whether it is aligned to 16 bytes. For some tasks I might need that alignment to execute SSE instructions that require their memory operands to be aligned on a 16-byte boundary.

Peter Cordes
Bulat M.
  • Your `_start` shouldn't use that prologue, because it's not a function. It's your entry point, and `rbp` doesn't have a meaningful value, and the value at `[rsp]` isn't a return address (it's argc). If you want, you could run `mov rbp, rsp` to reference argc, argv, and the environment vars. – Peter Cordes Aug 31 '16 at 00:00

1 Answer


The x86-64 System V ABI explicitly says (section 3.4.1, Initial Stack and Register State):

%rsp The stack pointer holds the address of the byte with lowest address which is part of the stack. It is guaranteed to be 16-byte aligned at process entry.

Since _start is the first symbol of your executable that runs when the process is entered, you can be entirely sure that the stack pointer is 16-byte aligned when the OS transfers control to _start.
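To make the guarantee concrete, here is a minimal sketch of a `_start` that relies on it. It assumes x86-64 Linux, NASM syntax, and a static build without CRT start files (for example `nasm -felf64 align.asm && ld -o align align.o`; the file name is hypothetical). The exit status is RSP mod 16, which should be 0 on an ABI-conformant system.

```nasm
; Sketch only: assumes x86-64 Linux and a static, CRT-free build.
global _start

section .text
_start:
    ; No prologue here: [rsp] holds argc, not a return address.
    xor     ebp, ebp            ; rbp has no meaningful value at entry, so zero it
    sub     rsp, 16             ; reserve 16 bytes; RSP stays 16-byte aligned
    xorps   xmm0, xmm0          ; xmm0 = 0
    movaps  [rsp], xmm0         ; aligned store: faults if RSP is not 16-byte aligned
    mov     edi, esp
    and     edi, 15             ; exit status = RSP mod 16
    mov     eax, 60             ; SYS_exit on x86-64 Linux
    syscall
```

Running the binary and then `echo $?` shows the exit status. If the stack were misaligned, the `movaps` store would raise a fault before the exit call was reached.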

Daniel Kamil Kozar
  • In dynamically linked binaries, `ld.so` code is actually the first thing to run in your process, but you can count on it keeping the stack 16B-aligned. It does jump to the entry point of your process in the state specified by the ABI. (Note that most registers actually do hold garbage at that point, instead of being zeroed like Linux leaves them to prevent information leakage). See [this answer](http://stackoverflow.com/questions/36861903/assembling-32-bit-binaries-on-a-64-bit-system-gnu-toolchain/36901649#36901649) for more about building static or dynamic binaries without CRT start files. – Peter Cordes Aug 30 '16 at 23:58
  • @Daniel, and what about the main function? If the stack is 16-byte aligned there, should I use the prologue (`push rbp; mov rbp, rsp`) in it? (Wouldn't that degrade the 16-byte alignment to 8-byte alignment?) – Bulat M. Aug 31 '16 at 06:10
  • @BulatM. : you are not required to use this prologue at all. You can omit it entirely, or just do `sub rsp, 16; mov [rsp], rbp; mov rbp, rsp` if you really think you need a "base pointer". – Daniel Kamil Kozar Aug 31 '16 at 07:31
  • @BulatM.: IIRC, the ABI doc recommends `push 0` / `mov rbp, rsp` (or `xor ebp,ebp` if you don't need a frame pointer yourself), to terminate the linked-list of frame pointers if later code does use frame pointers. – Peter Cordes Feb 03 '23 at 17:27
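The question about `main` in the comments can be illustrated with an ordinary function call. Under the System V ABI the caller keeps RSP 16-byte aligned at the `call` instruction, so the pushed return address leaves RSP at 8 mod 16 on function entry, and `push rbp` brings it back to a 16-byte boundary. The sketch below assumes x86-64 Linux, NASM syntax, and a static CRT-free build; `check_alignment` is a hypothetical helper that stands in for any called function.

```nasm
; Sketch only: assumes x86-64 Linux and a static, CRT-free build.
global _start

section .text
; Hypothetical helper, called like a normal function.
check_alignment:
    push    rbp                 ; entry: RSP = 16n + 8; after the push: RSP = 16n
    mov     rbp, rsp
    sub     rsp, 16             ; 16 bytes of locals; RSP still 16-byte aligned
    xorps   xmm0, xmm0
    movaps  [rsp], xmm0         ; aligned store: faults if RSP is misaligned
    mov     eax, esp
    and     eax, 15             ; return value = RSP mod 16
    leave                       ; mov rsp, rbp / pop rbp
    ret

_start:
    xor     ebp, ebp            ; mark the outermost frame
    call    check_alignment     ; RSP is 16-byte aligned before the call
    mov     edi, eax            ; exit status = value returned by the helper
    mov     eax, 60             ; SYS_exit on x86-64 Linux
    syscall
```

In this sketch the prologue restores 16-byte alignment inside the called function rather than degrading it. The same reasoning applies to `main` when it is called from CRT startup code that follows the ABI.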