In this simple function, space is allocated for local variables. Then, variables are initialized and printf is called to output them.

000000000040056a <func>:
  40056a:       55                      push   rbp                     ; function prologue
  40056b:       48 89 e5                mov    rbp,rsp                 ; function prologue
  40056e:       48 83 ec 10             sub    rsp,0x10                ; allocating space for local variables
  400572:       8b 4d fc                mov    ecx,DWORD PTR [rbp-0x4] ; variable initialization
  400575:       8b 55 f8                mov    edx,DWORD PTR [rbp-0x8] ; variable initialization
  400578:       8b 45 f4                mov    eax,DWORD PTR [rbp-0xc] ; variable initialization
  40057b:       89 c6                   mov    esi,eax                 ; 2nd printf arg (first value to print)
  40057d:       bf 34 06 40 00          mov    edi,0x400634            ; 1st printf arg: address of the format string
  400582:       b8 00 00 00 00          mov    eax,0x0                 ; AL = 0: no vector-register args for the variadic call
  400587:       e8 84 fe ff ff          call   400410 <printf@plt>     ; printf()
  40058c:       c9                      leave                          ; clean up local variables, pops ebp
  40058d:       c3                      ret                            ; return to the address that was pushed onto the stack (by popping it into eip)

What confuses me is the line sub rsp,0x10. How does the program know to allocate 0x10 bytes? Is it a guess? Is the program parsed beforehand?

Peter Cordes
Happy Jerry
  • That's *allocating* 0x10 (16) bytes. You allocate at the start of a function. The stack grows downward. The compiler knows because it looked at the source code (or actually its internal representation of the logic) and saw how many things need what size of stack space, and also that RSP has to be 16-byte aligned before the `call`. Note that in 64-bit code, `leave` pops RBP, not EBP, and that `ret` pops into RIP, not EIP. – Peter Cordes Jan 19 '22 at 23:07
  • @PeterCordes Fixed. I had a brain fart. And ah, yes. Thank you. I saw that they were using 32-bit general-purpose registers and made a mistake. My bad. Additionally, for functions, are there cases where instructions use stack space beyond what was allocated for the local variables? For example, suppose a function allocates 0x10 bytes with `sub rsp, 0x10`, but then the next instruction pushes something onto the stack, causing the stack to grow. Is this possible? Does this even occur? – Happy Jerry Jan 19 '22 at 23:21
  • Yeah, if the first thing a function does is call another function with more than 6 args, or with a large struct. (modern GCC/clang won't use `push` for initializing local vars at the same time they reserve space; pretty much only for save/restore of call-preserved regs (which they do before reserving space for locals) or for passing stack args). e.g. https://godbolt.org/z/qv11z6nx4 happens to do that, with a sub for stack alignment. Or https://godbolt.org/z/GKejKn6vx shows actual space reserved for a variable, used after the call. Both gcc -O0 and -O3 do sub rsp, imm / push. – Peter Cordes Jan 19 '22 at 23:27
  • The programmer decides what data types to use for variables, choosing among the available ones in the given programming language. Those data types determine the quantity of physical storage needed, and the compiler (or assembly programmer) allocates that physical storage. There's no guessing. Yes, an entire function is parsed/seen by the compiler before it generates code; an assembly programmer would know the intent of a whole function before writing the assembly for it as well, so as to do the same: allocation of physical storage for logical variables in the algorithm. – Erik Eidt Jan 20 '22 at 00:12

1 Answer

The compiler knows because it looked at the source code (or actually its internal representation of the logic after parsing it) and added up the total size needed for all the things that it had to allocate stack space for. And also it has to get RSP 16-byte aligned before the call, given that RSP % 16 == 8 on function entry.
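
As a rough sketch of that arithmetic (not how GCC is actually implemented): `frame_bytes` and its parameter are hypothetical names, and the code assumes the prologue from the question, where `push rbp` leaves RSP 16-byte aligned, so the `sub` amount has to be a multiple of 16 to keep it aligned at the `call`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes to subtract from RSP after `push rbp`,
 * given the total size of the locals that need stack slots.
 * Assumes RSP % 16 == 8 at function entry, so RSP % 16 == 0 after
 * the push; the result is rounded up to a multiple of 16 so RSP is
 * still 16-byte aligned at the next call. */
size_t frame_bytes(size_t locals_bytes)
{
    const size_t align = 16;
    assert(locals_bytes <= (size_t)-1 - (align - 1)); /* avoid overflow */
    return (locals_bytes + align - 1) / align * align;
}
```

The three 4-byte slots at [rbp-0x4], [rbp-0x8] and [rbp-0xc] add up to 12 bytes, and `frame_bytes(12)` gives 16, which is the 0x10 in `sub rsp,0x10`.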

So alignment is one reason compilers may reserve more than the function actually uses, but compiler missed-optimization bugs can also make it waste space: it's common for GCC to waste an extra 16 bytes, although that's not happening here.

Yes, modern compilers parse the entire function (actually the whole source file) before emitting any code for it. That's kind of the point of an ahead-of-time optimizing compiler, so it's designed around doing that, even if you make a debug build. By comparison, TCC, the Tiny C Compiler, is one-pass: it leaves a spot in its function prologue and goes back later to fill in the total size once it has reached the bottom of the function in the source code. See "Tiny C Compiler's generated code emits extra (unnecessary?) NOPs and JMPs": when that number happens to be zero, there's still a sub esp, 0 there. (TCC only targets 32-bit mode.)
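
The leave-a-spot-and-fill-it-in-later idea can be sketched like this. It's a toy illustration of backpatching, not TCC's real code: `CodeBuf`, `emit_sub_esp_placeholder` and `patch_frame_size` are made-up names. The only x86 fact it relies on is that `sub esp, imm32` encodes as 81 EC followed by a 4-byte little-endian immediate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy code buffer (hypothetical; not TCC's data structures). */
typedef struct {
    uint8_t bytes[64];
    size_t  len;
} CodeBuf;

/* Emit `sub esp, imm32` (81 EC id) with a zero placeholder immediate.
 * Returns the offset of the immediate so it can be patched later. */
size_t emit_sub_esp_placeholder(CodeBuf *cb)
{
    assert(cb->len + 6 <= sizeof cb->bytes);
    cb->bytes[cb->len++] = 0x81;          /* opcode: group 1, imm32    */
    cb->bytes[cb->len++] = 0xEC;          /* ModRM: /5 (sub), rm = esp */
    size_t imm_off = cb->len;
    memset(&cb->bytes[cb->len], 0, 4);    /* size not known yet        */
    cb->len += 4;
    return imm_off;
}

/* After the whole function body has been seen, write the real frame
 * size into the placeholder as a little-endian 32-bit value. */
void patch_frame_size(CodeBuf *cb, size_t imm_off, uint32_t size)
{
    assert(imm_off + 4 <= cb->len);
    for (int i = 0; i < 4; i++)
        cb->bytes[imm_off + i] = (uint8_t)(size >> (8 * i));
}
```

If the final size turns out to be zero, the patched instruction is still there as `sub esp, 0`, which is the leftover the linked question asks about.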

Related: Function Prologue and Epilogue in C


In leaf functions, compilers can use the red zone below RSP when targeting the x86-64 System V ABI (except for kernel code, or other code compiled with -mno-red-zone), avoiding the need to reserve as much (or any) stack space even if there are some locals they choose to spill/reload (e.g. any at all in unoptimized code). See also "Why is there no "sub rsp" instruction in this function prologue and why are function parameters stored at negative rbp offsets?"

Or in Windows x64, callers need to reserve shadow space for their callee to use, which also gives small functions the chance to not spend any instructions moving RSP around, just using the shadow space above their return address. But for non-leaf functions, this means reserving at least 32 bytes of shadow space plus any for alignment or locals. See, for example, "Shadow space example".
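
Putting the red-zone and shadow-space rules side by side, here is a simplified model of how much a compiler might subtract from RSP. It is a sketch under stated assumptions, not any compiler's actual logic: `rsp_adjust` and `enum abi` are hypothetical names, the function is assumed to start with `push rbp` (so RSP is 16-byte aligned after the push), and it ignores stack args, extra register saves, and missed optimizations. The 128-byte red-zone size comes from the x86-64 System V ABI; the 32 bytes of shadow space are as described above.

```c
#include <assert.h>
#include <stddef.h>

enum abi { ABI_SYSV, ABI_WIN64 };   /* hypothetical labels */

/* Round n up to the next multiple of 16. */
static size_t round_up_16(size_t n) { return (n + 15) / 16 * 16; }

/* Simplified model: bytes subtracted from RSP after `push rbp`. */
size_t rsp_adjust(enum abi abi, int is_leaf, size_t locals_bytes)
{
    assert(locals_bytes < (1u << 20));   /* keep the sketch overflow-free */

    if (is_leaf) {
        /* System V: up to 128 bytes below RSP are usable (red zone). */
        if (abi == ABI_SYSV && locals_bytes <= 128)
            return 0;
        /* Windows x64: the caller's 32 bytes of shadow space sit just
         * above the return address and can hold small locals. */
        if (abi == ABI_WIN64 && locals_bytes <= 32)
            return 0;
    }

    /* A Windows x64 function that makes calls must provide shadow
     * space for its own callees on top of its locals. */
    size_t shadow = (abi == ABI_WIN64 && !is_leaf) ? 32 : 0;
    return round_up_16(shadow + locals_bytes);
}
```

With the question's 12 bytes of locals, this model gives 16 for a non-leaf System V function (matching the `sub rsp,0x10`), 0 if the function were a leaf, and 48 for a non-leaf Windows x64 function.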

In standard calling conventions for ISAs other than x86-64, other rules may come into play that affect things.


Note that in 64-bit code, leave pops RBP, not EBP, and that ret pops into RIP, not EIP.

Also, mov ecx,DWORD PTR [rbp-0x4] is not variable initialization. That's a load, from uninitialized memory into a register. Probably you did something like int a,b,c; without initializers, then passed them as args to printf.
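
For contrast, here is a guess at what the source would look like with the locals actually initialized. The original C source isn't shown in the question, so the function name and values are made up, and `snprintf` is swapped in for `printf` only so the result is easy to check. In an unoptimized build, each initializer would typically show up as a store of an immediate into the variable's stack slot before any loads from it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical variant of the asker's function with initializers.
 * Writes "1 2 3\n" into `out` and returns the number of characters
 * that snprintf reports (excluding the terminating NUL). */
int format_locals(char *out, size_t out_size)
{
    assert(out != NULL);
    int a = 1, b = 2, c = 3;   /* real initialization: stores, not loads */
    return snprintf(out, out_size, "%d %d %d\n", a, b, c);
}
```

Without the `= 1`, `= 2`, `= 3` initializers, reading `a`, `b` and `c` gives whatever happened to be in that stack memory, which is what the loads in the question's disassembly are doing.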

Peter Cordes
  • I think it's also worth pointing out that this is in many cases determined by the ABI. Some ABIs will use less stack than others, while others will save code size (`__stdcall` on x86-32/16). – Mgetz Jan 20 '22 at 13:45
  • @Mgetz: True. I think I ended up leaving that out because both x86-64 SysV and Windows x64 do the same 16-byte stack alignment. They differ in red-zone vs. shadow space, though, both of which can avoid allocating extra stack space in leaf functions, while Windows x64 shadow space requires allocating extra in non-leaf funcs. – Peter Cordes Jan 20 '22 at 15:29
  • Fair, I also didn't want to go too much into it because things like the `__stdcall` decision by Microsoft have interesting impacts on stack usage but aren't immediately relevant to the question, as it's directly related to function calls and not locals. But I do think it is worth mentioning because it does affect stack space for locals. – Mgetz Jan 20 '22 at 15:38
  • @Mgetz: Sure, edited with a couple new paragraphs with links. – Peter Cordes Jan 20 '22 at 15:52