how to solve possible unique combinations problem

Question

I need a combination algorithm for large numbers. I found something on StackOverflow but the implementation was not totally correct. The code below runs wrong if the size of the vector is larger than 22-24 and k is higher.

#include <bits/stdc++.h>

using namespace std;

template<typename T>
void pretty_print(const T& c, int combo)
{
    int n = c.size();
    for (int i = 0; i < n; ++i) {
        if ((combo >> i) & 1)
            cout << c[i] << ' ';
    }
    cout << endl;
}

template<typename T>
void combo(const T& c, int k)
{
    int n = c.size();
    int combo = (1 << k) - 1;       // k bit sets
    while (combo < 1<<n) {

        pretty_print(c, combo);

        int x = combo & -combo;
        int y = combo + x;
        int z = (combo & ~y);
        combo = z / x;
        combo >>= 1;
        combo |= y;
    }
}

int main()
{
    vector<char> c0 = {'1', '2', '3', '4', '5'};
    combo(c0, 3);

    vector<char> c1 = {'a', 'b', 'c', 'd', 'e', 'f', 'g'};
    combo(c1, 4);
    return 0;
}

This is taken from Creating all possible k combinations of n items in C++

Now, I'm using std::prev_permutation, it works but too slow for my analysis program. There are more than one thousand of combinations in my program. So that I wanted to use the algorithm above. How can ı fix this algorithm to work under all circumstances? Thank you, in advance.

For example there is a vector of {30,30,30,30,30,30,30,30,60,60,60,60,60,60,60,60,90,90,90,90,90,90,90,90}, when you try to find combination of this vector by 8, it does not do it correctly. I also think that bit calculation may be wrong, but I am not good at bitwise operations. I dont understand the while loop exactly — Bilal Can, Jun 25 '20 at 08:41
Can you exactly describe what a _"combination algorithm for large numbers"_ does? What does it mean to _"find combination of this vector by 8"_? — Thomas Sablik, Jun 25 '20 at 10:52
@ThomasSablik You can refer to https://en.wikipedia.org/wiki/Combination as what a combination do. — Ranoiaetep, Jun 25 '20 at 11:58
@ThomasSablik for example it generates for 8: (30,30,30,30,30,30,30,30), (30,30,30,30,30,30,30,60), (30,30,30,30,30,30,30,90), (30,30,30,30,30,30,60,60), (30,30,30,30,30,30,60,90), ........... , (60,90,90,90,90,90,90,90), (90,90,90,90,90,90,90,90) — Bilal Can, Jun 25 '20 at 12:34
Where do the numbers come from? Why do you use a vector of 24 elements instead of a map with structure {30: 8, 60: 8, 90: 8}? Can you show approach with `std::prev_permutation`? Why is it slower? — Thomas Sablik, Jun 25 '20 at 12:39
As far as I understand you want to get all distinct `k`-element subsets of an `n`-element set. One way is to use a primitive type like int and use bitshifts. Another approach is to use bitsets. But with both approaches you have to avoid duplicates. Another approach is restructure your elements into unordered maps `{30: 8, 60: 8, 90: 8}` and store the number of elements you took in a vector: `{0, 0, 8}`. You create all permutations with these elements. Then you increment the vector to `{0, 1, 7}` and create all permutations. At the end you will get all possible combinations. No duplicates. — Thomas Sablik, Jun 25 '20 at 13:22
@ThomasSablik hmm the last approach you said can be good. I will think it to code. Thanks — Bilal Can, Jun 25 '20 at 13:23

Ranoiaetep · Answer 1 · 2020-06-25T15:06:27.333

The reason it would fail is because the original algorithm you were posting was utilizing some bit wise calculation. It is actually consider each bit of a int type as a different number. And depending on the int defined by your compiler, you might be limited to a maximum n of either 32 or 64. You could also change all the int declaration to int64_t to force them to be defined in 64 bit.

However, doing that would still capped your maximum n to 64. (In fact, it is capped to the bit size - 2 or 62 or 30, so not entirely sure why you got stuck around 23) The real solution is to change all the int to std::bitset so that it could store a maximum n of SIZE_MAX.

Code as below:

#include <iostream>
#include <vector>
#include <bitset>

template<size_t bitSize>
void pretty_print(const std::vector<std::bitset<bitSize>>& c, std::bitset<bitSize> combo)
{
    int n = c.size();
    for (int i = 0; i < n; ++i) {
        if (((combo >> i) & std::bitset < bitSize>(1)) != std::bitset <bitSize>(0))
            std::cout << c[i].to_ullong() << ' ';
    }
    std::cout << std::endl;
}


template<size_t bitSize>
bool smallerThan(std::bitset<bitSize> bits1, std::bitset<bitSize> bits2)
{
    for (size_t i = bitSize - 1; i > 0; i--)
    {
        if (bits1[i] < bits2[i])
        {
            return true;
        }
    }
    return false;
}


template<size_t bitSize>
std::bitset<bitSize> bitAddition(std::bitset<bitSize> bits1, std::bitset<bitSize> bits2)
{
    std::bitset<bitSize> carry;
    while (bits2 != std::bitset<bitSize>(0))
    {
        carry = bits1 & bits2;
        bits1 = bits1 ^ bits2;
        bits2 = carry << 1;
    }
    return bits1;
}


template<size_t bitSize>
std::bitset<bitSize> bitSubtraction(std::bitset<bitSize> bits1, std::bitset<bitSize> bits2)
{
    while (bits2 != std::bitset<bitSize>(0))
    {
        std::bitset<bitSize> borrow = (~bits1) & bits2;
        bits1 = bits1 ^ bits2;
        bits2 = borrow << 1;
    }
    return bits1;
}



template<size_t bitSize>
std::bitset<bitSize> bitSubtractionStopAt0(std::bitset<bitSize> bits1, std::bitset<bitSize> bits2)
{
    while (bits2 != std::bitset<bitSize>(0))
    {
        std::bitset<bitSize> borrow = (~bits1) & bits2;
        bits1 = bits1 ^ bits2;
        bits2 = borrow << 1;
        if (bits1 == std::bitset<bitSize>(0)) return bits1;
    }
    return bits1;
}


template<size_t bitSize>
std::bitset<bitSize> bitDivision(std::bitset<bitSize> dividend, std::bitset<bitSize> divisor)
{
    std::bitset<bitSize> quotient(0);
    while (smallerThan(std::bitset<bitSize>(0), dividend))
    {
        dividend = bitSubtractionStopAt0(dividend, divisor);
        quotient = bitAddition(quotient, std::bitset<bitSize>(1));
    }
    return quotient;
}




template<size_t bitSize>
void combo(const std::vector<std::bitset<bitSize>>& c, int k)
{
    auto n = c.size();
    std::bitset<bitSize> one(1);
    std::bitset<bitSize> combo(bitSubtraction((one << k), std::bitset<bitSize>(1)));


    while (smallerThan(combo, (one << n)))
    {
        pretty_print(c, combo);
        auto negCombo = combo;
        negCombo.flip();
        for (size_t i = 0; i < bitSize; i++)
        {
            negCombo.flip(i);
            if (negCombo[i])
            {
                break;
            }
        }
        std::bitset<bitSize> x = combo & negCombo;
        std::bitset<bitSize> y;
        bool tempBit = 0;
        for (size_t i = 0; i < bitSize; i++)
        {
            y[i] = combo[i] ^ x[i];
            if (tempBit)
            {
                if (!y[i])
                {
                    tempBit = 0;
                }
                y[i] = y[i] ^ 1;
            }
            if (combo[i] & x[i])
            {
                tempBit = 1;
            }
        }
        std::bitset<bitSize> z = (combo & ~y);
        combo = bitDivision(z, x);
        combo >>= 1;
        combo |= y;
    }
}

int main()
{
    const int n = 500;
    int k = 2;
    std::bitset<(n + 2)> n_bits(n);
    std::vector<std::bitset<n + 2>> people;
    for (unsigned long i = 1; i < n_bits.to_ullong() + 1; ++i) { people.push_back(i); }

    combo(people, k);

    return 0;
}

I've only tested 90C4 and 500C2, but this should work for all n smaller than SIZE_MAX. Also I believe there are ways to optimize the bitwise calculations I've used better, not an expert on it.

Another approach is probably use a larger number type, such as int128_t or int1024_t. However, you would also need to overload some bitwise calculation.

Also you mentioned that:

For example there is a vector of {30,30,30,30,30,30,30,30,60,60,60,60,60,60,60,60,90,90,90,90,90,90,90,90}, when you try to find combination of this vector by 8, it does not do it correctly.

The way this algorithm work, it does not check the value of each members. Instead it automatically assume all members are unique. Similar to what a real nCr function would do.

Also some of the other method's mentioned in Creating all possible k combinations of n items in C++ are quite fast as well, especially since I can't find a good way of implementing bitwise calculation yet D: I guess you could have some data structure to separate the number as several bitset<63> object, so you could cast them to unsigned long long to use its bitwise operators. However not sure if that would run faster.

Either way if you are doing nCk ≈ thousands, then the difference for all those methods should be neglectable.

at first thanks, I am also using QT for the UI of the project, maybe there can be some data type for it in Qt. But, the programme will be 32-bit at the end of the deployment. By the way, I do not know who downed voted. Since I am new here, I cannot vote — Bilal Can, Jun 25 '20 at 12:01
@BilalCan I have updated my code, however they don't really work that fast because I'm not good at implementing bitwise calculations. Also those other methods in the link shouldn't be that slow if you are dealing with only nCk ≈ 5000, unless you were trying to find a solution for n ≈ 5000 — Ranoiaetep, Jun 25 '20 at 14:57
Your code produces duplicates: https://wandbox.org/permlink/yFrx3TAz36IGWObV OP asked for unique combinations. — Thomas Sablik, Jun 25 '20 at 15:10
@ThomasSablik You are right, but I was trying to solve the main problem, where he finds that code wouldn't run if `n` is greater than certain amount — Ranoiaetep, Jun 25 '20 at 15:15
@Ranoiaetep There is one problem about your code. Bitset requires compile time constant-expression, but the size of my vector actually changes in run-time. So that compiler gives and error. According to the inputs I should write const int n = degree_list_loop.size();, in your algortihm I cannot do it since it is run-time expression. — Bilal Can, Jun 26 '20 at 06:45
@BilalCan You are right, they behave like normal arrays for that aspect. If you want, you could define that to an arbitrary large number before you begins. — Ranoiaetep, Jun 27 '20 at 01:34
@Ranoiaetep it does not work when n is different from the size of the vector. Also when the vector has double values, it soe not work, too — Bilal Can, Jun 29 '20 at 10:33

how to solve possible unique combinations problem

1 Answers1