I compiled the following C++ code in Compiler Explorer. The arguments passed to the function were stored starting at rbp - 4 and sequentially onward when a return statement was included, but starting at rbp - 8 when it was not. Why does this happen? Is it a kind of optimization, or is something else occupying the rbp - 4 position?

int foo(int a, int b, int c)
{
    int d = 5;
    int e = a + b + c + d;
    //return e;
}

results in:

foo(int, int, int): # @foo(int, int, int)
  push rbp
  mov rbp, rsp
  mov dword ptr [rbp - 8], edi
  mov dword ptr [rbp - 12], esi
  mov dword ptr [rbp - 16], edx
  mov dword ptr [rbp - 20], 5
  mov edx, dword ptr [rbp - 8]
  add edx, dword ptr [rbp - 12]
  add edx, dword ptr [rbp - 16]
  add edx, dword ptr [rbp - 20]
  mov dword ptr [rbp - 24], edx
  ud2

and with the return statement included:

foo(int, int, int): # @foo(int, int, int)
 push rbp
 mov rbp, rsp
 mov dword ptr [rbp - 4], edi
 mov dword ptr [rbp - 8], esi
 mov dword ptr [rbp - 12], edx
 mov dword ptr [rbp - 16], 5
 mov edx, dword ptr [rbp - 4]
 add edx, dword ptr [rbp - 8]
 add edx, dword ptr [rbp - 12]
 add edx, dword ptr [rbp - 16]
 mov dword ptr [rbp - 20], edx
 mov eax, dword ptr [rbp - 20]
 pop rbp
 ret

Why is there a change in stack offset?

Compiler Explorer was set to x86-64 clang 6.0.0 with these options (symbols as displayed): .LX0: , .text , // , \s+ , Intel , and Demangle turned on.

EDIT: When I changed the return type of foo to void, the arguments were stored from rbp - 4 onward sequentially, as expected.

foo(int, int, int): # @foo(int, int, int)
  push rbp
  mov rbp, rsp
  mov dword ptr [rbp - 4], edi
  mov dword ptr [rbp - 8], esi
  mov dword ptr [rbp - 12], edx
  mov dword ptr [rbp - 16], 5
  mov edx, dword ptr [rbp - 4]
  add edx, dword ptr [rbp - 8]
  add edx, dword ptr [rbp - 12]
  add edx, dword ptr [rbp - 16]
  mov dword ptr [rbp - 20], edx
  pop rbp
  ret

The function declared as void, with no return statement, produces the following assembly with x86-64 gcc 7.3. The stack positions are not the same as clang's and are not sequential.

foo(int, int, int):
  push rbp
  mov rbp, rsp
  mov DWORD PTR [rbp-20], edi
  mov DWORD PTR [rbp-24], esi
  mov DWORD PTR [rbp-28], edx
  mov DWORD PTR [rbp-4], 5
  mov eax, DWORD PTR [rbp-24]
  mov edx, DWORD PTR [rbp-20]
  add eax, edx
  add eax, DWORD PTR [rbp-28]
  add eax, DWORD PTR [rbp-4]
  mov DWORD PTR [rbp-8], eax
  pop rbp
  ret

So what causes this?
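
One way to look at the layout without reading the assembly is to print where each object lives relative to the first parameter. The following is only a rough sketch: `foo_layout` and the `offset` lambda are hypothetical names, not part of the code above. Taking the addresses forces every object into memory, so the numbers need not match the listings above and will likely differ between compilers and optimization levels.

```cpp
#include <cstdint>
#include <cstdio>

// Same arithmetic as foo above, plus a printout of each object's byte
// offset relative to parameter `a`. The offsets are implementation
// details of the compiler; nothing in the language guarantees them.
int foo_layout(int a, int b, int c)
{
    int d = 5;
    int e = a + b + c + d;

    // Convert addresses to integers before subtracting, because pointer
    // subtraction between unrelated objects is not defined.
    const std::intptr_t base = reinterpret_cast<std::intptr_t>(&a);
    auto offset = [base](const int* p) {
        return static_cast<long long>(reinterpret_cast<std::intptr_t>(p) - base);
    };

    std::printf("b: %lld, c: %lld, d: %lld, e: %lld (bytes relative to a)\n",
                offset(&b), offset(&c), offset(&d), offset(&e));
    return e;
}
```

Calling `foo_layout(1, 2, 3)` under different compilers at `-O0` should show whether the slots are handed out sequentially or with gaps.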

WARhead
  • It's the opposite of optimization. You forgot to enable it. Add `-O2` or similar, otherwise you will be looking at garbage code. Note that the compiler figured the first version is invalid (undefined behavior) so you can't deduce much from that, optimized or not. – Jester Mar 22 '18 at 17:09
  • @Jester when I gave `void foo(...` the arguments are moved from `rbp - 4` onwards. Why does that happen? – WARhead Mar 22 '18 at 17:13
  • I guess a side note to what @Jester said: by standard having a function with a return type, without returning something is considered undefined behavior. It seems though, the behavior is defined on the compiler level. Just not what we wanted and potentially bad. – M4rc Mar 22 '18 at 17:14
  • Not returning from a function that is supposed to return is undefined behavior so the compiler can do whatever it wants. – NathanOliver Mar 22 '18 at 17:25
  • @NathanOliver what about the assembly by gcc for function declared void – WARhead Mar 22 '18 at 17:27
  • The code generator of a compiler is simplistic, it doesn't make any effort to generate smart addresses. It probably reserved -4 for a possible return statement somewhere in the middle of the code. The optimizer does the real job. – Hans Passant Mar 22 '18 at 17:27
  • @HansPassant in the gcc compiler, it did use `rbp - 4` and `rbp - 8` but next use is `rbp - 20` (sequentially) and all these datatypes are int ie. 4 bytes – WARhead Mar 22 '18 at 17:29
  • it probably reserved space for the intermediary sums. That is how dumb they are. Not a problem. – Hans Passant Mar 22 '18 at 17:33
  • @HansPassant not a problem. I am only starting to look at assembly code and could not understand why this happens – WARhead Mar 22 '18 at 17:35
  • Re-read my first comment. – Jester Mar 22 '18 at 17:41
  • when I add -O2 it optimises away the whole code to just the `ret` statement – WARhead Mar 22 '18 at 17:42
  • Yeah. Because your function doesn't do anything worthy to generate code for. – Jester Mar 22 '18 at 17:45
  • BTW, the `ud2` instead of `ret` is because it's undefined behaviour to reach the end of a non-void function. clang chooses to make the failure noisy instead of silently returning whatever garbage was left in `rax`. It would also warn about this at compile time. – Peter Cordes Mar 22 '18 at 18:05
  • If you try to understand the assembly, you'll see that the extra slot is to hold the return value. Of course you are seeing this only because you have turned off all optimizations. – Raymond Chen Mar 22 '18 at 20:02
  • @RaymondChen: Another recent question confirms that clang `-O0` does reserve space on the stack specifically for the return value, even when it's constant. So this question, returning a variable that already has its own stack space, is basically the same bit of compiler internals, I assume. – Peter Cordes Mar 23 '18 at 02:40
  • @PeterCordes the absence of `rbp - 4 ` can be agreed as reserved for the return statement (which is not there) but why `rbp - 20` when `rbp - 12` or `rbp - 16` is (I assume) available? – WARhead Mar 23 '18 at 13:36
  • I don't know, but that's a separate question about gcc internals instead of clang. Without optimization, maybe it doesn't try to avoid using a bit of extra stack space, or something about how it rounds total stack allocations to keep the stack aligned, but then doesn't actually adjust RSP because it can use the red zone? Read the gcc source code (or GIMPLE and/or RTL internal representations for your program) if you really want to understand why it leaves extra space when you told it not to try to optimize anything. – Peter Cordes Mar 23 '18 at 13:44
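
To make the undefined-behaviour point from the comments concrete, here is a minimal sketch of two well-defined ways to write the function. The names `foo_with_return`, `foo_void`, and `check_foo_variants` are made up for illustration, and the out-parameter in the void version is an assumption added so that the result is observable; without it an optimizing compiler may reduce the body to a bare `ret`, as noted above.

```cpp
#include <cassert>

// Non-void version: every path ends in a return statement, so control
// never flows off the end of the function.
int foo_with_return(int a, int b, int c)
{
    int d = 5;
    int e = a + b + c + d;
    return e;
}

// void version: reaching the closing brace is well defined. The result
// is handed back through a reference (hypothetical addition) so the
// computation has a visible effect.
void foo_void(int a, int b, int c, int& out)
{
    int d = 5;
    out = a + b + c + d;
}

// Small self-check comparing the two variants.
void check_foo_variants()
{
    int result = 0;
    foo_void(1, 2, 3, result);
    assert(result == foo_with_return(1, 2, 3));
}
```

Either variant avoids the `ud2` path seen in the first listing, since neither lets control fall off the end of a function that promises a return value.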

0 Answers