Generate a recursive series faster (similar to Fibonacci)

Question

Hi have to calculate the n term of a series, being n really big, and it has to be done as fast as posible.

The series is defined by the following function:

f(0) = 1
f(1) = 1
f(2n) = f(n)
f(2n+1) = f(n) + f(n-1)

I know I have to use memoization. I did this code but the problem is that with big n values is giving segmentation fault. I would like to try to do a 2 value array version (like the one described in one of the responses here), but I'm not reaching the proper solution.

uint64_t f(uint64_t n)
{
    uint64_t a[n+2];
    uint64_t i;

    a[0] = 1;
    a[1] = 1;
 
    for(i = 2; i <= n; i++)
    {
        if (i % 2 == 0)
        { 
            a[i] =  a[i / 2];
        }
        else
        {
            a[i] =  a[i / 2] + a[(i / 2) - 1];
        }
    }
    return a[n];
};

*Variable-Length Array* like `uint64_t a[n+2];` is not in the standard C++. — MikeCAT, May 14 '21 at 13:33
I tried with unsigned long long too. Segmentation faults are happening with values bigger than 100000 or so. — Alex P, May 14 '21 at 13:43
allocating `uint64_t a[100000+2];` will exceed the default stack size - move the array to the heap. — Richard Critten, May 14 '21 at 14:56

Eike · Accepted Answer · 2021-05-14T17:00:51.127

2

I think recursion is the right idea here, because you only need the last value and it actually doesn't depend on that many prior values. The best case is, if your input is a power of 2, say 2^n. Then you only need the values of n inputs. Although the performance is worse in other cases, it should still be much better than actually computing ALL preceding values.

EDIT: With the specific number requested in the comments and a counter variable to show the number of evaluations needed:

EDIT2: Of course it is possible (and a real performance boost!) to combine the recursion with caching intermediate results, as @Bob__ demonstrated in his comment below, thanks! For future readers here the full version with recursion+caching. With the given input, caching cuts down the number of needed evaluations of g dramatically from 12875760616 without caching to just 164 with caching.

#include <iostream>
#include <map>
#include <stdexcept>

static unsigned long count = 0;
static std::map<unsigned long, unsigned long> cache;

static unsigned long g(unsigned long n) {
  auto it = cache.find(n);
  if (it != cache.end()) {
    return it->second;
  }

  ++count;
  if (count == 0) {
    std::cout << "Integer overflow! Aborting!";
    throw std::overflow_error("overflow error!");
  }

  if (n == 0 or n == 1) {
    return 1;
  } else if (n % 2 == 0) {
    auto a = g(n / 2);
    cache.insert({n, a});
    return a;
  } else {
    auto a = g((n - 1) / 2) + g((n - 3) / 2);
    cache.insert({n, a});
    return a;
  }
}

int main() {
  try {
    std::cout << "result: " << g(123456789012345678) << std::endl;
    std::cout << "number of used values: " << count << std::endl;
  } catch (std::exception &e) {
    std::cout << "An error occured:\n" << e.what() << std::endl;
  }
}

edited May 14 '21 at 17:00

answered May 14 '21 at 13:55

Eike

359
3
8

Thanks for the help Eike. However, I think the recursive solution is not the most eficient one. – Alex P May 14 '21 at 14:07
Depends. Do you only need the last value, like your function returns or do you need all values from 1 to n? In the first case I'm reasonably sure, that recursion is the best solution, because of the considerations at the start of my answer. – Eike May 14 '21 at 14:10
@A.Palomo Do you *think* ;) or have you measured it? How big `n` can be in your test cases and how many tests do you have to perform? – Bob__ May 14 '21 at 14:10
Well there may be a faster solution, which is to compute in advance, which numbers < n you need to evaluate and then cleverly only compute those in a manner that the compiler can optimize better than the recursion. – Eike May 14 '21 at 14:14
Thanks all. I have to give a unique result, for n=123456789012345678 – Alex P May 14 '21 at 14:16
Added a counter variable to show, how often function g had to be called. In case this answer is good for you, please mark it accepted, I'm still hunting for some rep so I can comment on questions. – Eike May 14 '21 at 14:25
1

Hi Eike. Finally, and thanks to your help, I think I will stick to the recursive version. I was convinced this was the most naive and slow version, and I had to use dinamic programming or memoization. But the goal is to produce a solution that works in the millisecond regime, and the recursive option does it. Thanks again for your help. – Alex P May 14 '21 at 15:07
1

@A.Palomo Please note that memoization have a huge impact (see e.g. https://godbolt.org/z/TxY1ahaKc), especially in these type of recursive algorithms were statements like `return g(...) + g(...)` force the program to recalculate many values. – Bob__ May 14 '21 at 16:28
oh yes, one should combine the recursion with a lookup table. Silly me only thought of it afterwards, thanks for pointing it out, @Bob__ ! – Eike May 14 '21 at 17:02
Thanks for your time Eike. Your solution now looks awesome. – Alex P May 15 '21 at 14:29

Bob__ · Answer 2 · 2021-05-16T10:19:13.903

We can expand the posted recurrence relations considering two sets of four values at a time.

$\colorbox{white}{\begin{bmatrix}F_{2n}\F_{2n+1}\F_{2n+2}\F_{2n+3}\end{bmatrix} = \begin{bmatrix}F_{n}\F_{n-1} + F_n\F_{n+1}\F_n+F_{n+1}\end{bmatrix} = \begin{cases}\begin{bmatrix}0 & 1 & 0 & 0\ 1 & 1 & 0 & 0\ 0 & 0 & 1 & 0\ 0 & 1 & 1 & 0\end{bmatrix}\begin{bmatrix}F_{n-1}\F_{n}\F_{n+1}\ {\textcolor{gray}{F_{n+2}}}\end{bmatrix} & n = 2k + 1\&\\begin{bmatrix}0 & 0 & 1 & 0\ 0 & 1 & 1 & 0\ 0 & 0 & 0 & 1\ 0 & 0 & 1 & 1\end{bmatrix}\begin{bmatrix}{\textcolor{gray}{F_{n-2}}}\F_{n-1}\F_{n}\F_{n+1}\end{bmatrix} & n = 2k\end{cases}}$

Note that the first element of those sets is always in an even position (2n, n-1 when n = 2k+1 or n-2 when n = 2k) and only three of the elements of each set in the lower positions actually participate in the calculation.

Starting from a given N, for which we want to calculate F_N, we can find a sequence of decreasing values n_i representing those positions and p_i indicating which transformation to use.

n₀ = N - N mod 4

h_i = n_i / 2
p_i = h_i mod 2
n_i+1 = h_i + p_i - 2

Then we just need to consecutively apply the previous transformations to the vector {F₀, F₁, F₂, F₃} = {1, 1, 1, 2} and we can find F_N in O(logN).

The following is a possible implementation

unsigned long long f(unsigned long long n)
{
    unsigned long long F[] = {
        1, 1, 1, 2
    };
    auto idx = n % 4;
    auto pos = n - idx;
    unsigned long long journal{};
    int count{};
    while ( pos )
    {
        auto n = pos / 2;
        journal <<= 1;
        journal |= n % 2;
        pos = n + n % 2 - 2;
        ++count;
    }
    while ( count-- )
    {
        if ( journal & 1u )
        {
            unsigned long long tmp = F[1];
            F[1] += F[0];
            F[0] = tmp;
            F[3] = F[2] + tmp;
        }
        else
        {
            unsigned long long tmp = F[2];
            F[0] = tmp;
            F[1] += tmp;
            F[2] = F[3];
            F[3] += tmp;
        }
        journal >>= 1;
    }
    return F[idx];
}

Testable here.

Very cool! You did much more theory on this. Do you think one could even create an explicit formula? My first impression was "no", but these matrix-products at least look a bit more like the explicit formula derivation of the Fibonacci sequence. — Eike, May 20 '21 at 10:44
@Eike Well, that's beyond my knowledged. Sure, I was thinking about the matrix representation of the Fibonacci sequence, but in that case there is *one* matrix, pretty easily exponantiable and with a couple of nice eigenvalues... — Bob__, May 20 '21 at 11:23

Generate a recursive series faster (similar to Fibonacci)

2 Answers2