5

I'm having difficult time while trying to produce all possible strings of a given character set. Let S be set of symbols. I need to process all possible combinations of S in length n. For example if S={'a','b','+','-'} and n=4 the algorithm should process following sequences:

aaaa
aaab
abab
+aa-
// And all other sequences in the universe

Currently my algorithm is a non-efficient recursive algorithm described below. I have two question:

  1. Is there a faster algorithm?
  2. Is there a parallel algorithm to solve this problem?

Current implementation: (Simplified)

void workhorse(vector<char> &input, vector<char>::iterator i)
{
    if(i==inputs.end()) {
        // process the input
        return;
    }
    else {
        for( const auto& symbol : S) {
            *i=symbol;
            workhorse(input, i+1);
        }
    }
}   
Niklas B.
  • 92,950
  • 18
  • 194
  • 224
sorush-r
  • 10,490
  • 17
  • 89
  • 173
  • 1
    I think this is already pretty efficient. If you want parallelism, you can just "cut" the tree from a certain level on. Something like `if (i == input.begin()+3) { vector cpy(input); async(Foo::workhorse, cpy, cpy+i-input.begin()); return; }` (add a flag to not run into an infinite loop) – Niklas B. Mar 22 '14 at 20:22
  • You can use openmp to make it parallel – Gilad Mar 22 '14 at 20:23
  • @Gilad But not directly. You have to be careful when you need to copy the vector – Niklas B. Mar 22 '14 at 20:27

3 Answers3

2

Do you need all permutations or combinations? From your example it seems like combinations (order doesn't matter, symbols can be repeated) but in your code it looks like you may be trying to generate permutations (guessing by the function name). Combinations is a simple matter of counting in base n - in this case 4^4 and permutations would be the much fewer 4! but slightly more complicated recursive algorithm depending if you want to preserve lexicographic order. Either way the algorithms are among the basic pillars of computer science and well covered already, try these other Q's:

Generating all Possible Combinations

Generate list of all possible permutations of a string

Community
  • 1
  • 1
gordy
  • 9,360
  • 1
  • 31
  • 43
2

Your algorithm already looks pretty efficient, you're not wasting any work. The only thing that could probably be improved slightly is the function call overhead due to recursion. But recursion is good, because it allows for easy parallelization:

#include <thread>
#include <array>
#include <string>
#include <vector>
using namespace std;

array<char,3> S = {{ 'a', 'b', 'c' }};
const int split_depth = 2;
void workhorse(string& result, int i) {
    if (i == result.size()) {
        // process the input
        return;
    }
    if (i == split_depth) {
        vector<thread> threads;
        for (char symbol : S) {
            result[i] = symbol;
            threads.emplace_back([=] {
                string cpy(result);
                workhorse(cpy, i + 1);
            });
        }
        for (thread& t: threads) t.join();
    } else {
        for (char symbol : S) {
            result[i] = symbol;
            workhorse(result, i + 1);
        }
    }
}

int main() {
    string res(6, 0);
    workhorse(res, 0);
}

Make sure to compile it with C++11 features and threading enabled, for example

$ g++ -O3 -std=c++11 -lpthread [file].cpp

This version of the function will enumerate all prefixes of up to length split_depth sequentially and then spawn a thread to further process each of those. So it will start |S|^split_depth threads in total, which you can adapt to match your hardware concurrency.

Niklas B.
  • 92,950
  • 18
  • 194
  • 224
1

You could just do it iteratively. But it won't be much faster.

Imagine that characters in the set are digits.

'a' = 0, 'b' = 1, '+' = 2, '-' = 3

you start from 0000 and then increment it until you reach 3333.

0000, 0001, 0002, 0003, 0010, 0011, and so on...

This can be easily parallelised. For two threads let the first one do the job from 0000 to 1333 and the other from 2000 to 3333. Obviously this can be easily extended to an arbitrary number of threads.

There is nothing more to do. If your program is slow that's because there are so many combinations to choose. The time needed to find all the combinations by this code linearly depends on the number of combinations that there are.

ciamej
  • 6,918
  • 2
  • 29
  • 39
  • Your algorithm is exactly the same as OPs, just iterative instead of recursive... – Niklas B. Mar 22 '14 at 20:29
  • 1
    @NiklasB. To be honest i don't really get the OPs code. But it does something with permutations, so it doesn't seem obvious to me that OPs code is the same as mine... – ciamej Mar 22 '14 at 20:32
  • It does exactly this. And really, it's not that much code to read... Your suggestion to split the search range is sound though, it's just not really trivial to partition it evenly into *t* pieces where *t* is not a power of *|S|* (because you need biginteger arithmetics for that) – Niklas B. Mar 22 '14 at 20:36
  • @NiklasB. I didn't get deep into it. Now I see that the `permutation` function should be `Foo::workhorse` right? If so, now I understand the code and can tell it's pretty much the same ;) – ciamej Mar 22 '14 at 20:38