3

I've tried to compute the binomial coefficient by making a recursion with Pascal's triangle. It works great for small numbers, but 20 up is either really slow or doesn't work at all.

I've tried to look up some optimization techniques, such as "chaching" but they don't really seem to be well integrated in C++.

Here's the code if that helps you.

int binom(const int n, const int k)
{
    double sum;

    if(n == 0 || k == 0){
            sum = 1;
    }
    else{
    sum = binom(n-1,k-1)+binom(n-1,k);
    }

    if((n== 1 && k== 0) || (n== 1 && k== 1))
       {
           sum = 1;
       }
    if(k > n)
    {
        sum = 0;
    }

    return sum;

}

int main()
{
int n;
int k;
int sum;

cout << "Enter a n: ";
cin >> n;
cout << "Enter a k: ";
cin >> k;

Summe = binom(n,k);

cout << endl << endl << "Number of possible combinations: " << sum << 
endl;

}

My guess is that the programm wastes a lot of time calculating results it has already calculated. It somehow must memorize past results.

Quotenbanane
  • 263
  • 1
  • 6
  • 16
  • Please give an example of sample input, expected output, and actual output. – dbush Mar 29 '19 at 16:38
  • your guess is correct and that’s exactly what caching tries to avoid. What have you tried to implement it? if you don’t have a clue, how could you think to store an retrieve values between successive calls? – Christophe Mar 29 '19 at 16:39
  • you can put in a sample yourself. Try 50 for n and 45 for k for instance. – Quotenbanane Mar 29 '19 at 16:39
  • Consider that this algorithm essentially adds a bunch of 1's, so it runs at least as long as the magnitude of the result.. not really ideal for functions with potentially large results – harold Mar 29 '19 at 16:40
  • Expected output for that should be 2118760, but i always kill of the programm because it never finishes – Quotenbanane Mar 29 '19 at 16:40
  • @Christophe I haven't tried because it seems to complicated. In Sage you've just to put "cached_function" in front of it. Seems more complicated in C++ – Quotenbanane Mar 29 '19 at 16:41
  • @harold sounds not good, yeah. – Quotenbanane Mar 29 '19 at 16:42
  • Here are some alternatives, many without memoization but just using better algorithms: https://stackoverflow.com/q/9330915/555045 – harold Mar 29 '19 at 16:46

5 Answers5

5

My guess is that the program wastes a lot of time calculating results it has already calculated.

That's definitely true.

On this topic, I'd suggest you have a look to Dynamic Programming Topic.

There is a class of problem which requires an exponential runtime complexity but they can be solved with Dynamic Programming Techniques. That'd reduce the runtime complexity to polynomial complexity (most of the times, at the expense of increasing space complexity).


The common approaches for dynamic programming are:

  • Top-Down (exploiting memoization and recursion).
  • Bottom-Up (iterative).

Following, my bottom-up solution (fast and compact):

int BinomialCoefficient(const int n, const int k) {
  std::vector<int> aSolutions(k);
  aSolutions[0] = n - k + 1;

  for (int i = 1; i < k; ++i) {
    aSolutions[i] = aSolutions[i - 1] * (n - k + 1 + i) / (i + 1);
  }

  return aSolutions[k - 1];
}

This algorithm has a runtime complexity O(k) and space complexity O(k). Indeed, this is a linear.

Moreover, this solution is simpler and faster than the recursive approach. It is very CPU cache-friendly.

Note also there is no dependency on n.

I have achieved this result exploiting simple math operations and obtaining the following formula:

(n, k) = (n - 1, k - 1) * n / k

Some math references on the Binomial Coeffient.


Note

The algorithm does not really need a space complexity of O(k). Indeed, the solution at i-th step depends only on (i-1)-th. Therefore, there is no need to store all intermediate solutions but just the one at the previous step. That would make the algorithm O(1) in terms of space complexity.

However, I would prefer keeping all intermediate solutions in solution code to better show the principle behind the Dynamic Programming methodology.

Here my repository with the optimized algorithm.

BiagioF
  • 9,368
  • 2
  • 26
  • 50
  • 3
    I believe this is a better answer than just presenting a program to solve the problem, because it says something about the underlying concepts which can help OP in other situations. – Robert Dodier Mar 29 '19 at 22:33
  • Perfect explanation, additional suggestions and an explanation of the math behind it. Just awesome, thanks! – Quotenbanane Mar 30 '19 at 21:19
  • the non-recursive code in the answer is missing the if (k < 0) return 0; if (k == 0) return 1; or at least the second half of it. – Erik Itter Sep 27 '22 at 01:51
1

You're computing some binomial values multiple times. A quick solution is memoization.

Untested:

int binom(int n, int k);

int binom_mem(int n, int k)
{
    static std::map<std::pair<int, int>, std::optional<int>> lookup_table;
    auto const input = std::pair{n,k};
    if (lookup_table[input].has_value() == false) {
        lookup_table[input] = binom(n, k);
    }
    return lookup_table[input];
}

int binom(int n, int k)
{
    double sum;

    if (n == 0 || k == 0){
        sum = 1;
    } else {
        sum = binom_mem(n-1,k-1) + binom_mem(n-1,k);
    }

    if ((n== 1 && k== 0) || (n== 1 && k== 1))
    {
        sum = 1;
    }
    if(k > n)
    {
        sum = 0;
    }

    return sum;
}

A better solution would be to turn the recursion tailrec (not easy with double recursions) or better yet, not use recursion at all ;)

YSC
  • 38,212
  • 9
  • 96
  • 149
1

I would cache the results of each calculation in a map. You can't make a map with a complex key, but you could turn the key into a string.

string key = string("") + n.to_s() + "," + k.to_s();

Then have a global map:

map<string, double> cachedValues;

You can then do a lookup with the key, and if found, return immediately. otherwise before your return, store to the map.

I began mapping out what would happen with a call to 4,5. It gets messy, with a LOT of calculations. Each level deeper results in 2^n lookups.

I don't know if your basic algorithm is correct, but if so, then I'd move this code to the top of the method:

if(k > n)
{
    return 0;
}

As it appears that if k > n, you always return 0, even for something like 6,100. I don't know if that's correct or not, however.

Joseph Larson
  • 8,530
  • 1
  • 19
  • 36
  • I've never tried your solution. My serialization works. For any particular problem, if you ask 4 programmers, you'll get 6 to 15 different solutions. I bet my serialization is easier to understand for a new student to programming. – Joseph Larson Mar 29 '19 at 16:54
  • I didn't meant to sound mean I'm sorry. This was just a reaction jest to a funny/sub-optimal solution. I've withdrawn my comment. – YSC Mar 29 '19 at 16:56
  • 1
    Naw, it's cool. It got me to look at your example. I'm actually fairly recently back to C++, having programmed Java for 20 years, and a lot has changed. I'm still getting used to some of it, such as what you did for the key of your map. – Joseph Larson Mar 29 '19 at 16:59
1

If you can tolerate wasting some compile time memory, you can pre-compute a Pascal-Triangle at compile time. With a simple lookup mechanism, this will give you maximum speed.

The downsite is that you can only calculate up to the 69th row. After that, even an unsigned long long would overflow.

So, we simply use a constexpr function and calculate the values for a Pascal triangle in a 2 dimensional compile-time constexpr std::array.

The nCr function simply uses an index into that array (into Pascals Triangle).

Please see the following example code:

#include <iostream>
#include <utility>
#include <array>
#include <iomanip>
#include <cmath>

// Biggest number for which nCR will work with a 64 bit variable: 69
constexpr size_t MaxN = 69u;
// If we store Pascal Triangle in a 2 dimensional array, the size will be that
constexpr size_t ArraySize = MaxN;

// This function will generate Pascals triangle stored in a 2 dimension std::array
constexpr auto calculatePascalTriangle() {

    // Result of function. Here we will store Pascals triangle as a 1 dimensional array
    std::array<std::array<unsigned long long, ArraySize>, ArraySize> pascalTriangle{};

    // Go through all rows and columns of Pascals triangle
    for (size_t row{}; row < MaxN; ++row) for (size_t col{}; col <= row; ++col) {

        // Border valus are always one
        unsigned long long result{ 1 };
        if (col != 0 && col != row) {

            // And calculate the new value for the current row
            result = pascalTriangle[row - 1][col - 1] + pascalTriangle[row - 1][col];
        }
        // Store new value
        pascalTriangle[row][col] = result;
    }
    // And return array as function result
    return pascalTriangle;
}
// This is a constexpr std::array<std::array<unsigned long long,ArraySize>, ArraySize> with the name PPP, conatining all nCr results 
constexpr auto PPP = calculatePascalTriangle();

// To calculate nCr, we used look up the value from the array
constexpr unsigned long long nCr(size_t n, size_t r) {
    return PPP[n][r];
}

// Some debug test driver code. Print Pascal triangle
int main() {
    constexpr size_t RowsToPrint = 16u;
    const size_t digits = static_cast<size_t>(std::ceil(std::log10(nCr(RowsToPrint, RowsToPrint / 2))));
    for (size_t row{}; row < RowsToPrint; ++row) {

        std::cout << std::string((RowsToPrint - row) * ((digits + 1) / 2), ' ');

        for (size_t col{}; col <= row; ++col)
            std::cout << std::setw(digits) << nCr(row, col) << ' ';
        std::cout << '\n';
    }
    return 0;
}

We can also store Pascals Triangle in a 1 dimensional constexpr std::array. But then we need to additionally calculate the Triangle numbers to find the start index for a row. But also this can be done completely at compile time.

Then the solution would look like this:

#include <iostream>
#include <utility>
#include <array>
#include <iomanip>
#include <cmath>

// Biggest number for which nCR will work with a 64 bit variable
constexpr size_t MaxN = 69u; //14226520737620288370
// If we store Pascal Triangle in an 1 dimensional array, the size will be that
constexpr size_t ArraySize = (MaxN + 1) * MaxN / 2;

// To get the offset of a row of a Pascals Triangle stored in an1 1 dimensional array
constexpr size_t getTriangleNumber(size_t row) {
    size_t sum{};
    for (size_t i = 1; i <= row; i++)   sum += i;
    return sum;
}

// Generate a std::array with n elements of a given type and a generator function
template <typename DataType, DataType(*generator)(size_t), size_t... ManyIndices>
constexpr auto generateArray(std::integer_sequence<size_t, ManyIndices...>) {
    return std::array<DataType, sizeof...(ManyIndices)>{ { generator(ManyIndices)... } };
}
// This is a std::arrax<size_t,MaxN> withe the Name TriangleNumber, containing triangle numbers for ip ti MaxN
constexpr auto TriangleNumber = generateArray<size_t, getTriangleNumber>(std::make_integer_sequence<size_t, MaxN>());

// This function will generate Pascals triangle stored in an 1 dimension std::array
constexpr auto calculatePascalTriangle() {

    // Result of function. Here we will store Pascals triangle as an 1 dimensional array
    std::array <unsigned long long, ArraySize> pascalTriangle{};

    size_t index{}; // Running index for storing values in the array

    // Go through all rows and columns of Pascals triangle
    for (size_t row{}; row < MaxN; ++row) for (size_t col{}; col <= row; ++col) {

        // Border valuse are always one
        unsigned long long result{ 1 };
        if (col != 0 && col != row) {

            // So, we are not at the border. Get the start index the upper 2 values 
            const size_t offsetOfRowAbove = TriangleNumber[row - 1] + col;
            // And calculate the new value for the current row
            result = pascalTriangle[offsetOfRowAbove] + pascalTriangle[offsetOfRowAbove - 1];
        }
        // Store new value
        pascalTriangle[index++] = result;
    }
    // And return array as function result
    return pascalTriangle;
}
// This is a constexpr std::array<unsigned long long,ArraySize> with the name PPP, conatining all nCr results 
constexpr auto PPP = calculatePascalTriangle();

// To calculate nCr, we used look up the value from the array
constexpr unsigned long long nCr(size_t n, size_t r) {
    return PPP[TriangleNumber[n] + r];
}


// Some debug test driver code. Print Pascal triangle
int main() {
    constexpr size_t RowsToPrint = 16; // MaxN - 1;
    const size_t digits = static_cast<size_t>(std::ceil(std::log10(nCr(RowsToPrint, RowsToPrint / 2))));
    for (size_t row{}; row < RowsToPrint; ++row) {

        std::cout << std::string((RowsToPrint - row+1) * ((digits+1) / 2), ' ');

        for (size_t col{}; col <= row; ++col)
            std::cout << std::setw(digits) << nCr(row, col) << ' ';
        std::cout << '\n';
    }
    return 0;
}
A M
  • 14,694
  • 5
  • 19
  • 44
1

I found this very simple (perhaps a bit slow) method of writing the binomial coefficient even for non integers, based on this proof (written by me):

double binomial_coefficient(float k, int a) {
    double b=1;
    for(int p=1; p<=a; p++) {
        b=b*(k+1-p)/p;
    }
    return b;
}
Gorka
  • 1,971
  • 1
  • 13
  • 28
Bin
  • 11
  • 1
  • Your function would be clearer if the parameter names matched the inputs in the derivation. Otherwise very nice work. – duffymo Jun 18 '21 at 13:06