1

I have been trying to work out what the following recursive function does:

func4:
0x08048cfa <+0>:    push   edi
0x08048cfb <+1>:    push   esi
0x08048cfc <+2>:    push   ebx
0x08048cfd <+3>:    mov    ebx,DWORD PTR [esp+0x10] // First arg
0x08048d01 <+7>:    mov    edi,DWORD PTR [esp+0x14] // Second arg
0x08048d05 <+11>:   test   ebx,ebx // if (ebx == 0) { eax = 0; return ???;}
0x08048d07 <+13>:   jle    0x8048d34 <func4+58>
0x08048d09 <+15>:   mov    eax,edi
0x08048d0b <+17>:   cmp    ebx,0x1 // if (ebx == 1) {return ???;} 
0x08048d0e <+20>:   je     0x8048d39 <func4+63>
0x08048d10 <+22>:   sub    esp,0x8
0x08048d13 <+25>:   push   edi
0x08048d14 <+26>:   lea    eax,[ebx-0x1]// eax = ebx-1
0x08048d17 <+29>:   push   eax
0x08048d18 <+30>:   call   0x8048cfa <func4>
0x08048d1d <+35>:   add    esp,0x8 // esp += 8
0x08048d20 <+38>:   lea    esi,[edi+eax*1] // esi = edi + eax
0x08048d23 <+41>:   push   edi
0x08048d24 <+42>:   sub    ebx,0x2 // ebx -= 2
0x08048d27 <+45>:   push   ebx
0x08048d28 <+46>:   call   0x8048cfa <func4>
0x08048d2d <+51>:   add    esp,0x10 // esp += 10
0x08048d30 <+54>:   add    eax,esi // eax += esi
0x08048d32 <+56>:   jmp    0x8048d39 <func4+63>
0x08048d34 <+58>:   mov    eax,0x0 // eax = 0
0x08048d39 <+63>:   pop    ebx
0x08048d3a <+64>:   pop    esi
0x08048d3b <+65>:   pop    edi
0x08048d3c <+66>:   ret

To date, I have figured out that it takes ebx, decrements it by one, passes it back to itself and recurses until it hits one of the base cases, then moves on to the next step of the recursion. However, I haven't fully understood what that branch of the recursion does, or what esp is doing in this context.

Any hints as to how to proceed? I have already stepped through it quite a few times with gdb, but have not really noticed any sort of pattern that would help me determine what is happening.

1 Answers1

2

It seems that you don't know that the result is returned in the eax register. With that in mind the code is not difficult to understand. Assuming that the cdecl calling convention is used (because the stack is cleaned up by caller), it is same as this js function:

function func4(a, b)
{
    if (a <= 0) return 0;
    if (a == 1) return b;
    return b + func4(a-1, b) + func4(a-2, b);
}

and is the asm code with comments

func4:
0x08048cfa <+0>:    push   edi              ; save non-volatile registers
0x08048cfb <+1>:    push   esi
0x08048cfc <+2>:    push   ebx
0x08048cfd <+3>:    mov    ebx, [esp+0x10]  ; ebx <- a
0x08048d01 <+7>:    mov    edi, [esp+0x14]  ; edi <- b
0x08048d05 <+11>:   test   ebx, ebx         ; if (a <= 0)
0x08048d07 <+13>:   jle    0x8048d34        ;   return 0
0x08048d09 <+15>:   mov    eax, edi         ; result <- 0
0x08048d0b <+17>:   cmp    ebx, 0x1         ; if (a == 1)
0x08048d0e <+20>:   je     0x8048d39        ;   return result;
0x08048d10 <+22>:   sub    esp, 0x8         ; this is useless
0x08048d13 <+25>:   push   edi              ; passing 2nd arguments
0x08048d14 <+26>:   lea    eax, [ebx-0x1]   ; 
0x08048d17 <+29>:   push   eax              ; passing 1st arguments
0x08048d18 <+30>:   call   0x8048cfa<func4> ; ax = func4(a - 1, b)
0x08048d1d <+35>:   add    esp, 0x8         ; clean up the stak after calling
0x08048d20 <+38>:   lea    esi, [edi+eax*1] ; temp = b + func4(a - 1, b)
0x08048d23 <+41>:   push   edi              ; passing 2nd arguments
0x08048d24 <+42>:   sub    ebx, 0x2         ;
0x08048d27 <+45>:   push   ebx              ; passing 1st arguments
0x08048d28 <+46>:   call   0x8048cfa<func4> ; ax = func4(a - 2, b)
0x08048d2d <+51>:   add    esp, 0x10        ; clean up the stak and the useless 8 bytes
0x08048d30 <+54>:   add    eax, esi         ; result = func4(a - 2, b) + temp
0x08048d32 <+56>:   jmp    0x8048d39        ; 
0x08048d34 <+58>:   mov    eax, 0x0         ; jump to here when a <= 0
0x08048d39 <+63>:   pop    ebx
0x08048d3a <+64>:   pop    esi
0x08048d3b <+65>:   pop    edi
0x08048d3c <+66>:   ret

LEA is meant for calculating memory offsets, but it is widely used to doing fused multiplication and addition because it is quick and convenient. Two more advantages are: 1) you can assign the result to a register different from the source registers; 2) it doesn't affect the flags.

W. Chang
  • 494
  • 3
  • 10
  • 1
    LEA's name comes from using the CPU's ability to decode addressing modes, not because it's *meant* only for use with valid pointers. [Using LEA on values that aren't addresses / pointers?](https://stackoverflow.com/a/46597375). We don't know what the architect of 8086 (Stephen Morse) was thinking when he added it; he might well have intended to expose the ability to add 2 regs into a different destination for arbitrary use. – Peter Cordes Oct 10 '18 at 01:42
  • @PeterCordes Base of the name (Load effective address), the requirement that the source must be a memory operand, and the fact that it doesn't affect the flags and is listed in "Address object transfers" section in the 8086 manual, I will be surprised if its original intention is not for calculation address. Of course no one except the architect can be sure about this. But the original intention means little to programmers; they always find the most efficient way to do things. `xor eax, eax` and `or eax, -1` are good examples. – W. Chang Oct 11 '18 at 03:04
  • Interesting, the part about where it's listed in the 8086 manual is new to me. The other factors are things I mentioned in my answer that I linked above, see that for my detailed argument on this. (Interesting point though about not setting flags like other ALU being a sign of design intent. Unless that was just an implementation detail that went with using address-decoding logic in 8086.) Anyway, I'm only somewhat interested in history, mostly in how it's most useful to think to the current instruction. You can and should think of it as an integer shift-and-add since 386 scaled-index. – Peter Cordes Oct 11 '18 at 03:19
  • @PeterCordes I agree. I am mainly using it for addition anyway. – W. Chang Oct 11 '18 at 07:14