Your asm does provide enough facilities to implement the usual procedure call / return sequence. You can push a return address and jump as a `call`, and pop a return address (into a scratch location) and do an indirect jump to it as a `ret`. We can just make `call` and `ret` macros. (Except that generating the correct return address is tricky in a macro; we might need a label (`push ret_addr`), or something like `set tmp, IP` / `add tmp, 4` / `push tmp` / `jump target_function`.) In short, it's possible, and we should wrap it up in some syntactic sugar so we don't get bogged down in that while looking at recursion.
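For example, here's a hedged sketch of an open-coded call sequence along those lines, assuming a readable `IP` that yields the address of the *next* instruction and instruction-granular addressing (neither of which your question actually specifies):

```asm
# hypothetical open-coded "call" under the assumptions above
set  tmp, IP            # tmp = address of the `add` below
add  tmp, 3             # skip the add, push, and jump (3 instructions;
                        # scale by instruction size if byte-addressed)
push tmp                # push the computed return address
jump target_function    # the callee's `ret` pops it and jumps back here
```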
With the right syntactic sugar, you can implement `Fibonacci(n)` in assembly that will actually assemble for both x86 and your toy machine.
You're thinking in terms of functions that modify static (global) variables. Recursion requires local variables, so each nested call to the function has its own copy of its locals. Instead of having registers, your machine has (apparently unlimited) named static variables (like `x` and `y`). If you want to program it like MIPS or x86, and copy an existing calling convention, just use some named variables like `eax`, `ebx`, ..., or `r0` .. `r31`, the way a register architecture uses registers.
Then you implement recursion the same way you do in a normal calling convention, where either the caller or the callee uses `push`/`pop` to save/restore a register on the stack so it can be reused. Function return values go in a register, and function args should go in registers too. An ugly alternative would be to push them after the return address (creating a caller-cleans-the-args-from-the-stack calling convention), because you don't have a stack-relative addressing mode to access them the way x86 does (above the return address on the stack). Or you could pass return addresses in a link register, like most RISC `call` instructions (usually called `bl` or similar, for branch-and-link), instead of pushing it like x86's `call`. (Then non-leaf callees have to push the incoming `lr` onto the stack themselves before making another call.)
A (silly and slow) naively-implemented recursive Fibonacci might do something like:
```c
int Fib(int n) {
    if (n <= 1) return n;   // Fib(0) = 0; Fib(1) = 1
    return Fib(n-1) + Fib(n-2);
}
```
## Valid implementation in your toy language *and* x86 (AMD64 System V calling convention)

### Convenience macros for the toy asm implementation

```asm
# Pretend that the call implementation has some way to make each
# return_address label unique so you can use it multiple times,
# i.e. just pretend that pushing a return address and jumping is a
# solved problem, however you want to solve it.
%define call(target)  push return_address; jump target; return_address:
%define ret  pop rettmp; jump rettmp   # dedicate a whole variable just for ret, because we can

# As the first thing in your program: set eax, 0 / set ebx, 0 / ...
```

```asm
global Fib
Fib:
    # input: n in edi
    # output: return value in eax

    # if (n <= 1) return n;   // Fib(0) = 0; Fib(1) = 1
    # The asm implementation of this part isn't interesting or relevant.
    # We know it's possible with some adds and jumps, so just pseudocode /
    # handwave it:
    ... set eax, edi and ret if edi <= 1 ...   # (not shown because not interesting)

    add  edi, -1
    push edi         # save n-1 for use after the recursive call
    call Fib         # eax = Fib(n-1)
    pop  edi         # restore edi to *our* n-1
    push eax         # save the Fib(n-1) result across the next call
    add  edi, -1
    call Fib         # eax = Fib(n-2)
    pop  edi         # use edi as scratch to hold the Fib(n-1) we saved earlier
    add  eax, edi    # eax = return value = Fib(n-1) + Fib(n-2)
    ret
```
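For completeness, the handwaved base case could be spelled like this in the toy asm. Purely a sketch: `jgt a, b, label` (jump if `a > b`) is a hypothetical compare-and-branch instruction, not something your ISA is known to have, and unlike the rest of the function this part is no longer valid x86:

```asm
    jgt edi, 1, recurse   # if (n > 1) goto recurse
    set eax, edi          # base case: return value = n (Fib(0)=0, Fib(1)=1)
    ret                   # (the ret macro: pop rettmp / jump rettmp)
recurse:
```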
During a recursive call to `Fib(n-1)` (with `n-1` in `edi` as the first argument), the `n-1` arg is also saved on the stack, to be restored later. So each function's stack frame contains the state that needs to survive the recursive call, and a return address. This is exactly what recursion is all about on a machine with a stack.
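To make that concrete, here's the stack partway into `Fib(3)`, just after `Fib(2)` has made its first recursive call and we've entered `Fib(1)` (a trace of the code above, most recently pushed entry first):

```asm
# return address back into Fib(2)     <- pushed by Fib(2)'s `call Fib`
# 1 = Fib(2)'s saved n-1              <- pushed by Fib(2)'s `push edi`
# return address back into Fib(3)     <- pushed by Fib(3)'s `call Fib`
# 2 = Fib(3)'s saved n-1              <- pushed by Fib(3)'s `push edi`
# return address into Fib(3)'s original caller
```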
Jose's example doesn't demonstrate this as well, IMO, because no state needs to survive the call for `pow`. So it just ends up pushing a return address and args, then popping the args, building up nothing but return addresses on the stack; then at the end it follows the chain of return addresses back down. It could be extended to save local state across each nested call, but as written it doesn't actually illustrate that.
My implementation is a bit different from how gcc compiles the same C function for x86-64 (using the same calling convention of first arg in `edi`, return value in `eax`). gcc 6.1 with `-O1` keeps it simple and actually does two recursive calls, as you can see on the Godbolt compiler explorer (`-O2` and especially `-O3` do some aggressive transformations). gcc saves/restores `rbx` across the whole function, and keeps `n` in `ebx` so it's available after the `Fib(n-1)` call (and then keeps `Fib(n-1)` in `ebx` so it survives the second call). The System V calling convention specifies `rbx` as a call-preserved register, but `rdi` as call-clobbered (and used for arg-passing).
Obviously you can implement Fib(n) much faster non-recursively, with O(n) time and O(1) space, instead of the naive recursion's O(Fib(n)) time and O(n) space (maximum stack depth). Recursion makes a terrible implementation choice for Fibonacci, but it's the traditional example because it's trivial.
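A hedged sketch of that in the same toy asm, reusing the hypothetical `jgt` from the base-case sketch above (it also clobbers `ebx`/`ecx` without saving them, which a real calling convention would care about):

```asm
Fib_iter:               # input: n in edi; output: eax.  O(n) time, O(1) space.
    set eax, 0          # a = Fib(0)
    set ebx, 1          # b = Fib(1)
loop:
    jgt 1, edi, done    # loop while n >= 1: exit when 1 > n, i.e. n == 0
    set ecx, eax
    add ecx, ebx        # c = a + b
    set eax, ebx        # a = b
    set ebx, ecx        # b = c
    add edi, -1         # n--
    jump loop
done:
    ret                 # eax = Fib(n)
```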