Please view the following code snippets:

int& sum(int& num1, int& num2) {
    num1++;
    num2++;
}

00000000 <_Z3sumRiS_>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   e8 fc ff ff ff          call   4 <_Z3sumRiS_+0x4>    // why here is a jump
   8:   05 01 00 00 00          add    $0x1,%eax     

   // why 0x8, my understanding is there are in total 3 parameters
   // num2 -- 0xc(%ebp), num1 -- 0x8(%ebp), this -- 0x4(%ebp)
   // am I right????
   d:   8b 45 08                mov    0x8(%ebp),%eax

  10:   8b 00                   mov    (%eax),%eax
  12:   8d 50 01                lea    0x1(%eax),%edx        // what the heck is this?
  15:   8b 45 08                mov    0x8(%ebp),%eax
  18:   89 10                   mov    %edx,(%eax)
  1a:   8b 45 0c                mov    0xc(%ebp),%eax
  1d:   8b 00                   mov    (%eax),%eax
  1f:   8d 50 01                lea    0x1(%eax),%edx
  22:   8b 45 0c                mov    0xc(%ebp),%eax
  25:   89 10                   mov    %edx,(%eax)
  27:   90                      nop
  28:   5d                      pop    %ebp
  29:   c3                      ret    

I need to understand what every single line of it does, and right now it confuses me.

Edee
  • That isn't a C++ member function so there's no `this` argument. The stack slot above the saved EBP is the return address. Also, it has undefined behaviour because you fall off the end of a non-`void` function. GCC `-O0` seems to on purpose evaluate the last expression with side effects in the return-value register, so this [sort of works as returning `num2++`](https://codegolf.stackexchange.com/questions/2203/tips-for-golfing-in-c#comment403454_106067) if you only care about `gcc -O0`. It's totally broken for any other use. – Peter Cordes May 05 '20 at 05:44
  • @PeterCordes That was a C++ code. I used it as an experience. No return value on purpose. – Edee May 05 '20 at 07:54
  • As for `what the heck is this?`, `lea 0x1(%eax),%edx` loads the value that `eax` points to, adds one, then stores the result in `edx`. IOW, this is part of your `num1++;` - the store to memory that follows completes it. – 500 - Internal Server Error May 05 '20 at 07:58
  • Looking at compiler output from code with Undefined Behaviour could be confusing, I don't recommend it as the first thing you start with. See [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116) - https://godbolt.org/ can match up source lines with asm blocks. – Peter Cordes May 05 '20 at 08:01
  • I know this is C++, the `&` reference makes that clear. But it's not a *member* function so there's no `this` pointer. If you did `auto tmp = this;` in that function, it would be a compile-time error. – Peter Cordes May 05 '20 at 08:03
  • yeah, this is not a member function, I just don't get why the assembly starts from `0x8(%ebx)`, rather than `0x4(%ebx)` @PeterCordes – Edee May 05 '20 at 08:05
  • It's using EBP as the frame pointer, not EBX. [What are the ESP and the EBP registers?](https://stackoverflow.com/a/60773337) - `4(%ebp)` is the return address, `8(%ebp)` is the first stack arg. – Peter Cordes May 05 '20 at 08:11
  • @500-InternalServerError If what you said was right, then what `add $0x1,%eax` does? – Edee May 05 '20 at 08:11
  • @PeterCordes I think EBP is the base pointer. why is `mov 0x8(%ebp),%eax`, what's the meaning of this line. Why it use ebp to locate `0x8(%ebp)` – Edee May 05 '20 at 08:13
  • Did you read the rest of my last comment? I linked you an answer that explains it. (And I just edited my comment to say more.) – Peter Cordes May 05 '20 at 08:14
  • @PeterCordes Just saw it. Thx. I know how they work. My point was `add $0x1,%eax` – Edee May 05 '20 at 08:17
  • Also, the `call` / `add` is because this is a 32-bit PIE, and with optimization disabled GCC is making code to get a pointer to the GOT. And you're looking at disassembly of a `.o` so the linker hasn't filled in offsets yet. Use `objdump -drwC` (and preferably `-Mintel` unless you actually like AT&T syntax). Or much better, look at compiler asm output instead of disassembly from binary. – Peter Cordes May 05 '20 at 08:17
  • @PeterCordes OK, got it. – Edee May 05 '20 at 08:17

1 Answer

   3:   e8 fc ff ff ff          call   4 <_Z3sumRiS_+0x4>

This isn't the real destination of the call; it is a placeholder that the linker will fill in. If you run `objdump -dr sum.o`, you will find it is actually a call to `__x86.get_pc_thunk.ax`. The same applies to the operand of the following `add`; together the two instructions set up a pointer to the GOT. (This function doesn't need one, but you compiled without optimization and with `-fpie` on by default.)

For more details, take a look at "Why does gcc generates strange code without flag -fno-pie?"


Section 2.2.2 of the System V i386 ABI describes the structure of a stack frame.

System V i386 ABI Table 2.2

So your stack frame looks like this:

0xc  |      num2      |
0x8  |      num1      |
0x4  | return address |
0x0  | previous %ebp  |  <-- %ebp
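
As a rough sketch of the layout above, the frame can be modelled as a struct of 32-bit slots, lowest address first. The struct and field names are hypothetical (they do not come from the ABI document), and the model assumes 4-byte addresses as on i386. It only illustrates why the first argument sits at `0x8(%ebp)` rather than `0x4(%ebp)`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical model of the i386 frame seen from %ebp, lowest address first.
// Each slot is 4 bytes because addresses and ints are 32 bits wide on i386.
struct Frame {
    std::uint32_t saved_ebp;       // 0x0(%ebp): caller's %ebp, stored by `push %ebp`
    std::uint32_t return_address;  // 0x4(%ebp): pushed by the caller's `call`
    std::uint32_t num1;            // 0x8(%ebp): first argument (the address behind the reference)
    std::uint32_t num2;            // 0xc(%ebp): second argument
};

// True when the model's offsets agree with the diagram.
constexpr bool frame_matches_diagram() {
    return offsetof(Frame, saved_ebp) == 0x0 &&
           offsetof(Frame, return_address) == 0x4 &&
           offsetof(Frame, num1) == 0x8 &&
           offsetof(Frame, num2) == 0xc;
}

static_assert(frame_matches_diagram(), "offsets should match the stack diagram");
```

There is no slot for `this` because `sum` is a free function; the slot at `0x4` is the return address.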

For the remaining instructions, here is a step-by-step analysis.

// num1 and num2 are references, so at the assembly level each one is an address (a pointer)
   d:   8b 45 08                mov    0x8(%ebp),%eax        // load the address passed for num1 into %eax
  10:   8b 00                   mov    (%eax),%eax           // load the int stored at that address into %eax
  12:   8d 50 01                lea    0x1(%eax),%edx        // compute %eax + 1 into %edx (no memory access)
  15:   8b 45 08                mov    0x8(%ebp),%eax        // reload the address of num1 into %eax
  18:   89 10                   mov    %edx,(%eax)           // store the incremented value back through it
  1a:   8b 45 0c                mov    0xc(%ebp),%eax        // same sequence for num2
  1d:   8b 00                   mov    (%eax),%eax
  1f:   8d 50 01                lea    0x1(%eax),%edx
  22:   8b 45 0c                mov    0xc(%ebp),%eax
  25:   89 10                   mov    %edx,(%eax)
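
To connect the walkthrough back to the source, here is a minimal C++ sketch that spells out each instruction as one statement. It is an illustration, not compiler output: `sum_as_pointers` and `demo` are hypothetical names, the references are written as explicit pointers, and the function returns `void` so that it does not fall off the end of a non-`void` function the way the original does.

```cpp
#include <cassert>

// Hypothetical pointer-based equivalent of sum(int&, int&).
// Each statement mirrors one instruction from the listing above.
void sum_as_pointers(int* num1, int* num2) {
    int* addr;    // plays the role of %eax when it holds an address
    int value;    // plays the role of %eax when it holds the loaded int
    int plus_one; // plays the role of %edx

    // num1++
    addr = num1;          // mov 0x8(%ebp),%eax
    value = *addr;        // mov (%eax),%eax
    plus_one = value + 1; // lea 0x1(%eax),%edx  (pure arithmetic, no load)
    addr = num1;          // mov 0x8(%ebp),%eax  (unoptimized code reloads it)
    *addr = plus_one;     // mov %edx,(%eax)

    // num2++
    addr = num2;          // mov 0xc(%ebp),%eax
    value = *addr;        // mov (%eax),%eax
    plus_one = value + 1; // lea 0x1(%eax),%edx
    addr = num2;          // mov 0xc(%ebp),%eax
    *addr = plus_one;     // mov %edx,(%eax)
}

// Small check: both arguments end up incremented by one.
int demo() {
    int a = 1;
    int b = 10;
    sum_as_pointers(&a, &b);
    assert(a == 2);
    assert(b == 11);
    return a + b;
}
```

Calling `demo()` from `main` exercises the function; the point is only that `lea` here is used as a plain add, and the following `mov` performs the store that completes each `++`.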
Peter Cordes
KagurazakaKotori