0

I am having trouble creating a x64 masm program that computes the result of the sum of the odd values. The answer is in RAX as 109E82h (1089154d). I am having difficulty trying to figure out how masm works, but it is so confusing. This is what I have tried so far, but the value in the RAX register is incorrect. I have tried as much as I can, but I do not know where I am going wrong.


extrn ExitProcess: proc

.data
fib1 QWORD 0
fib2 QWORD 1
sum QWORD 0
  
.code
_main PROC

    ; calculate the Fibonacci sequence up to 10^7
    mov rax, 10000000
    mov rcx, fib1
    mov rdx, fib2
FIB_LOOP:
    add rcx, rdx
    mov fib1, rdx
    mov fib2, rcx
    cmp rcx, rax
    jle FIB_LOOP
    ; add odd numbers to the sum
    mov rcx, fib1
    add rcx, rdx  ; rdx contains the last even Fibonacci number
SUM_LOOP:
    test cl, 1    ; check if the current number is odd
    jz SKIP_ADD
    add sum, rcx
SKIP_ADD:
    add rcx, rdx
    mov rdx, rcx
    cmp rcx, rax
    jle SUM_LOOP

    ; exit the program
    mov rax, sum
    call ExitProcess

_main ENDP
END
Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
  • Try debugging it with a small series: first just one element, then two.l – Erik Eidt Apr 07 '23 at 15:16
  • 1
    You don't need any memory variables for this, there's enough registers for the whole algorithm. – Erik Eidt Apr 07 '23 at 17:11
  • Related: [MASM x86 Computing the sum of odd numbers Fibonacci sequence between 0 and 1000000](https://stackoverflow.com/q/75955107) has some hints in comments under someone else's (worse) attempt at the same homework. But they asked a better SO question, explaining what they did get when they ran it vs. what they expected. That's an important part of a [mcve]. Just saying "doesn't work" is not helpful. – Peter Cordes Apr 07 '23 at 19:55

1 Answers1

1

but I do not know where I am going wrong.

Well, I don't understand why the first part of your program is calculating the Fibonacci sequence up to 10^7, when the title mentions you need to deal with numbers from the range [0,10^6]. That is 10 times higher than what is needed! And after that, the sum loop goes into even bigger numbers, luckily the RAX limit wasn't changed and so that part exits soon.

  • You need to sum the odd Fibonacci numbers as you run through the loop that creates them.
  • You can abstain from using memory-based variables in this code. You have plenty of registers to use.
  • Although you want 64-bit code, you don't always need to use 64-bit registers. The limited numerical range of the task allows you to employ the more efficient 32-bit registers. Since writing to the low dword of a 64-bit register automatically zeroes the high dword, in the end it is RAX that will hold the result 1089154.
  xor  eax, eax     ; RAX=0 is Number1 (Fibonacci)
  lea  ecx, [rax+1] ; RCX=1 is Number2
  cdq               ; RDX=0 is SumOfOdds

More:
  xchg eax, ecx     ; -> RCX = 0,  1,  1,  2,  3,  5,   8, ...
  add  eax, ecx     ; -> RAX = 1,  1,  2,  3,  5,  8,  13, ...
  test al, 1
  jz   isEven
  add  edx, eax     ; -> RDX  +1, +1,     +3, +5,     +13, ...
isEven:
  cmp  eax, 1000000
  jb   More

  test al, 1        ; Keep this one-time correction outside of the loop
  jz   Done
  sub  edx, eax     ; Take back last addition of an odd Fib
Done:
  xchg eax, edx     ; Final result in RAX

This could be an interesting alternative (untested):

  xor  eax, eax     ; RAX=0 is SumOfOdds
  lea  ecx, [rax+1] ; RCX=1 is Number2
  cdq               ; RDX=0 is Number1 (Fibonacci)

SumIt:
  add  eax, edx     ; -> RAX  +1, +1,     +3, +5,     +13, ...
More:
  xchg edx, ecx     ; -> RCX = 0,  1,  1,  2,  3,  5,   8, ...
  add  edx, ecx     ; -> RDX = 1,  1,  2,  3,  5,  8,  13, ...
  test dl, 1
  jz   More
  cmp  edx, 1000000
  jb   SumIt

                    ; Final result in RAX
Sep Roland
  • 33,889
  • 7
  • 43
  • 76
  • 1
    `mov eax, edx` at the end is more efficient than `xchg`. This isn't code-golf, we're not usually trying to optimize for minimal code-size. So actually, use EAX for SumOfOdds the whole time, avoiding the need for any `mov` instruction. (If you're counting bytes, that has a net cost cost of 2 extra byte of code size: dropping the final `xchg` saves 1, but it costs 1 byte each for `xchg ecx, edx` instead of EAX, and for `test dl, 1` instead of AL, and for `cmp edx, 1000000` instead of EAX. Actually we can optimize `xchg`+`add` into 3-byte `xadd ecx, edx`, saving 1 byte if EAX isn't involved.) – Peter Cordes Apr 09 '23 at 22:14
  • Or unroll by 3 and save on the looping logic, like `add ecx, edx` / `add edx, ecx` / `xadd edx, ecx` I think works, leaving the highest Fibonacci number in EDX, and the one before in ECX, ready for another iteration (https://www.felixcloutier.com/x86/xadd). Two `add` instructions can grab those results. – Peter Cordes Apr 09 '23 at 22:16
  • Since x86-64 guarantees SSE2, the fun way would be to unroll by 3x or 6x and use one extra `paddq` per 3x `paddq` to accumulate both odd numbers, with the SIMD elements being Fib(n) and Fib(n+1) as in [Assembly Language (x86): How to create a loop to calculate Fibonacci sequence](https://stackoverflow.com/a/32661389) (You might need a counted loop, though, since checking the Fib(n) values would require an extra `movd` defeating the savings vs. scalar.) For the low iteration counts it takes to overflow a 32-bit or even 64-bit integer, not a big speedup. – Peter Cordes Apr 10 '23 at 00:57