I am currently implementing a code that runs Fibonacci sequence (up to the 60th number) in riscv32I, which means that the memory address that I can use only has 32bits.
I have first implemented the code in C, and then in Assembly, but I am curious if the algorithm I used has a name so I can do more research. The code is as such,
#include <stdint.h>
#include <stdio.h>
#include <inttypes.h>
int main() {
uint32_t n1 = 0; // first pre number (n - 2)
uint32_t n2 = 1; // second pre number (n - 1)
uint32_t add = 0; // current number
uint32_t store_hi = 0;
uint32_t store_lo = 0;
uint32_t result; // result
uint32_t carry; // carry bit
for (int i = 2; i < 61; i++) {
carry = 0; // reset carry bit
add = (uint32_t)(n2 + n1); // calculate current fib number
if (add < n1) { // if overflow
carry = 1; // set carry bit
}
result = store_hi + store_lo; // keeping track of higher bits
result = result + carry; // add carry bit
store_lo = store_hi; //
n1 = n2; // update first pre number
store_hi = result; //
n2 = add; // update second pre number
}
printf("Result32: 0x%08" PRIx32 " 0x%08" PRIx32 "\n", result, add);
uint64_t result64 = ((uint64_t)result << 32) | add;
printf("Result64: 0x%016" PRIx64 " -> %" PRId64 "\n", result64, result64);
}
Running the code gives
Result32: 0x00000168 0x6c8312d0
Result64: 0x000001686c8312d0 -> 1548008755920
The basic concept is that because the Fibonacci number gets too big to fit within a single 32bit memory address, we have to split it into 32bit memory address, one holding the upper bit, and one holding the lower bit.
Let's generalize the above algorithm to a 4 bit memory space, to make it easier to follow the algorithm. This means that the maximum int can be 16. Let ss set n1 = 10, n2 = 10.
Loop 1:
add = 4 # (10 + 10 = 20, but overflow, so 20 % 16 = 4)
carry = 1
result = 1
store_lo = 0
store_hi = 1
n1 = 10
n2 = 4
# output: 0x14, 0x1 hi bit, 0x4 lo bit, which is 10 + 10 = 20
Loop 2:
add = 14
carry = 0
result = 1
store_lo = 1
store_hi = 1
n1 = 4
n2 = 14
# output: 0x1e, 0x1 hi bit, 0xe or 14, lo bit, which is 10 + 20 = 30
loop 3:
add = 2 (14 + 4 = 18, but overflow, so 18 % 16, 2)
carry = 1
result = 3
store_lo = 1
store_hi = 2
n1 = 14
n2 = 2
#output: 0x32, 0x3 hi bit, 0x2 low bit, which is 20 + 30 = 50
.... and so on.
This should work for any base, but I am curious what this algorithm is denoted as, or if it is simply related to modules and powers?
Thanks!