Assembly language and Fibonacci numbers

Question

To keep this simple, I have received my first MASM32 project and it goes as follows: "Write an assembly program to generate and display the first 24 Fibonacci numbers, beginning with 1 and ending with 46368." Since this is my first project I honestly have no idea how to even begin or write this code out. Any help would be much appreciated. Thank you so much.

You could go the super easy way and just create data containing the first 24 numbers and print them out. ;) — David Hoelzer, Mar 29 '16 at 00:08
Try writing the program in another language like C first, then see if you can write it again in assembly. — rcgldr, Mar 29 '16 at 00:31
@DavidHoelzer: In this case the assignment wording says the numbers have to be generated by the assembly program. — Peter Cordes, Mar 29 '16 at 00:37
There's always the old write a flow chart based on an algorithm approach. — David Hoelzer, Mar 29 '16 at 00:41
Where to begin: Look at compiler output, and if you're still stuck then search for [`[x86] fibonacci`](http://stackoverflow.com/search?q=%5Bx86%5D+fibonacci) here on stackoverflow. For the calculation part, I'd recommend [my answer](http://stackoverflow.com/questions/32659715/assembly-language-x86-how-to-create-a-loop-to-calculate-fibonacci-sequence/32661389#32661389) on another Fibonacci question for an efficient loop. It starts simple, then gets complicated, so just stop reading once you get in over your head. It stores to an array, but you could change that to `printf` calls. — Peter Cordes, Mar 29 '16 at 00:42
That duplicate also wants to print the series in 32bit x86, with a call to `_printf`. Or maybe just prints the last value, but does demonstrate using printf. The code is horribly inefficient, storing all temporaries to memory, though. — Peter Cordes, Mar 29 '16 at 00:51

Tommylee2k · Answer 1 · 2016-03-29T16:41:00.703

fibunacci is usually tended to be solved recursively, while solving it with a simple loop is as easy

you already know your array will be 24 words large,
and on the first look, it looks like your loop will be 24 steps

while f(1) and f(2) are already set, and you need at least these 2 elements to work with, the number of elements needed is 22, not 24 ... so let's start with what we already know

fib  dw 24 dup (0)    ; i don't know masm too well, this should create 24 words
    mov word ptr[fib  ],1   ; f(1)
    mov word ptr[fib+2],1   ; f(2)

    mov cx,22           ; this will be the loop counter
l1:
    ; add up the last 2 elements
    dec cx
    jnz l1

ok, so far so good. "add up last 2 elements" BEGS for 2 registers, where the last two elements are stored, to not always have to read them from memory. so let's use AX and BX, AX for element n-2, BX for n-1. we add them up, and store the result in the 'next' array element.
for arrays, it's great to use si and di, while we're writing, DI (destination index) is our best choice

fib  dw 24 dup (0)    ; i don't know masm too well, this should create 24 words

calculate:
    cld
    mov word ptr[fib  ],1   ; f(1)
    mov ax, 1   
    mov word ptr[fib+2],1   ; f(2)
    mov bx, 1
    mov di, offset fib+4    ; does this work in MASM ?

    mov cx,22           ; this will be the loop counter
l1:
    add ax,bx
    stosw           ; write new element to array   
                    ; note: this is equivalent to mov [di], AX and add si,2

    dec cx
    jnz l1

after adding up, we need to load ax and bx with the correct values, ax with f(n) and bx with f(n-1) ... we could do that with reading from array, but if we take a closer look, AX is f(n) (last element just computed) for this loop, which will be f(n-1) for next loop, and BX is f(n-1) , which will be f(n-2) for the next loop. So all we have to do is just swap them

fib  dw 24 dup (0)    ; i don't know masm too well, this should create 24 words

calculate:
    cld
    mov word ptr[fib  ],1   ; f(1)
    mov ax, 1   
    mov word ptr[fib+2],1   ; f(2)
    mov bx, 1
    mov di, offset fib+4    ; does this work in MASM ?

    mov cx,22           ; this will be the loop counter
l1:
    add ax,bx
    stosw           ; write new element to array   
                    ; note: this is equivalent to mov [di], AX and add si,2
    xchg ax,bx      

    dec cx
    jnz l1

et voilà : you have your 24 elements computed

PS: i'm not sure MASM will compile this, i'm using another assembler, give me a hint and i'll correct it

push/pop? What the hell? `xchg ax,bx` does the same thing. For a loop-end condition, you could `cmp edi, 24*2 + OFFSET fib` or something instead of wasting ecx on a counter. Of course, you don't have to store the results at all: calling `printf` from inside the loop is the easiest way (keep loop state in call-preserved registers). And why are you using 16bit pointers? The OP seems to be asking specifically about 32bit. — Peter Cordes, Mar 29 '16 at 14:59
Also, Fibonacci is often implemented recursively, but that's only because people pick it as an example of a recursive function when teaching recursion. It's harder and *much* slower to implement recursively. (O(Fib(n)) instead of O(n), for a simple recursive implementation without memoization.) Unrolling by two means even less moving data around: `add ebx, edi` / print ebx / `add edi, ebx` / print edi does the trick. Anyway, IMO people should teach recursion using a function where recursion makes sense and a loop requires a stack-like data structure. tree traversal is a good example. — Peter Cordes, Mar 29 '16 at 15:05
I could use "pointer" comparison, but CX's not needed, and I like using registers as a counter, so why should I. This way it's more readble (at least for my eyes, but that's personal favour). But you're completely right about exchange, i keep forgetting instructions (too many instruction sets don't help remembering which set contains which instruction), I've changed it — Tommylee2k, Mar 29 '16 at 16:44
I don't like introducing counters that aren't used for anything. `cmp/jcc` can fuse on AMD and Intel even pre-SnB, but `dec/jcc` can only fuse on Intel SnB-family. Comparing to an end-pointer still ties up a register to hold the end pointer, so you're right that the advantage tends to be small. (It's rare that the end point can be an immediate in real code, since pointer increments are generally more efficient than indexed addressing modes, esp on Intel SnB-family). However, decrementing an index is great for loops with indexing that can traverse an array backwards. — Peter Cordes, Mar 29 '16 at 16:53
BTW, `xchg` is 3 uops on Intel. You could use `lea esi, [edi, ebx]` / `mov edi, ebx` / `mov ebx, esi` or something if you want to avoid unrolling the loop to get rid of the extra moves. — Peter Cordes, Mar 29 '16 at 16:58

Assembly language and Fibonacci numbers

1 Answers1