3

Imagine I have a bag of 26 scrabble tiles - one for each letter in the English alphabet.

My goal is to create an array of all possible strings up to n letters long. Say n=3.

Constraints:

  1. Letters must always be in alphabetical order (ABC, not CBA; combinations, not permutations)
  2. Strings must be <= n letters long (allow algorithm to break out of any loops at a given length)
  3. Letters may not repeat (A, not AA)

How can I most efficiently generate this array in PHP?

In other words, how can I avoid brute force looping through all possible combinations and filtering out those that don't match the rules above?


If my alphabet only contained 3 letters — $alphabet = range('a','c'); — I'd expect output of an array of 7 items (3C1+3C2+3C3): [A,B,C,AB,AC,BC,ABC].

If my alphabet only contained 4 letters — $alphabet = range('a','d'); — I'd expect output of an array of 15 items (4C1+4C2+4C3+4C4): [A,B,C,D,AB,AC,AD,BC,BD,CD,ABC,ABD,ACD,BCD,ABCD]. But if I wanted to limit to only strings <= 3 letters long, then I would ignore ABCD resulting in only 14 items (4C1+4C2+4C3).


$alphabet = range('a','z');

print_r(generate_strings($alphabet,1));
// expected output: A,B,C,...Z

print_r(generate_strings($alphabet,2));
// expected output: A..Z, AB..AZ, BC..BZ, CD, ..YZ

print_r(generate_strings($alphabet,3));
// expected output: A..Z, AB..YZ, ABC..XYZ

print_r(generate_strings($alphabet,10));
// expected output: A .. JKLMN .. AGKQRZ .. QRSTUVWXYZ
//                        ^         ^          ^10 character max, no repeats
//                        |         still alphabetical order
//                        alphabetical order

function generate_strings($alphabet,$max_word_length) {

    // how can I efficiently generate this array
    // without brute force looping through all of
    // the invalid and duplicate items like AA and CBA?

    return $array_of_all_possible_strings;
}
Ryan
  • 14,682
  • 32
  • 106
  • 179
  • Then what about [this question](http://stackoverflow.com/questions/12293870/algorithm-to-get-all-possible-string-combinations-from-array-up-to-certain-lengt?rq=1)? – Mark Baker Aug 01 '14 at 17:26
  • @MarkBaker That question deals with permutations (i.e. aaa is valid). This question is about combinations. I can brute force all permutations and then filter for combinations, but I'm looking to skip all invalid combinations the first time through. – Ryan Aug 01 '14 at 17:28
  • why not just brute force the permutations and don't add it to the final result array if the duplicates exist? For example, calculate the `AA` string, get a count of each character and if > 1 then don't add it to the output, essentially just checking for validity. At least you are not having to loop over a second time for validity checking. – Jonathan Kuhn Aug 01 '14 at 17:29
  • Either way I have to loop through many invalid options. I'd rather skip processing invalid options altogether. – Ryan Aug 01 '14 at 17:32
  • Mh, my first idea would be a recursive solution but I hope someone comes up with a smarter solution. –  Aug 01 '14 at 19:50

3 Answers3

2

I thought this looked like fun. Here's my attempt at it, for what it's worth:

function recurse($letters, &$words, $start, $end, $depth, $prefix = "") {
    $depth--;
    for ($i = $start; $i < $end; $i++) {
        $word = $prefix . $letters[$i];
        $words[] = $word;
        if ($depth) recurse($letters, $words, ++$start, $end, $depth, $word);
    }
}
function generate_strings($letters, $max_word_length) {
    $words = array();
    recurse($letters, $words, 0, count($letters), $max_word_length);
    return $words;      
}
Don't Panic
  • 41,125
  • 10
  • 61
  • 80
1

Don't know php, but the algorithm is clear:

  1. Sort the N letters (if not already) into an array.
  2. Maintain two dynamic arrays: the list of all combinations of length N or less (what you want), and the list of all strings of length N-1 or less.
  3. Loop backwards through the character array from the last character.
  4. Add the ith character to the list of all strings as a single-character string. Also add to the list of strings of length N-1 if N > 1.
    1. Now loop through the list of strings of length N-1 or less, going from the start of the array to the current end of the array.
    2. For each shorter string, create a new string by prepending the ith character.
    3. Add this new string to the list of all strings.
    4. And if of length N -1 or less, add to the list of shorter strings. (Careful with iterators here - you don't want to visit the string you just added in this inner loop.)
  5. Return the list of strings.
dbc
  • 104,963
  • 20
  • 228
  • 340
  • Thanks. I'm leaving this open a bit longer in the event someone wants to contribute a code solution, but I believe your algorithm is on the right track. – Ryan Aug 01 '14 at 23:44
0

I edited this similar post of mine today (since I realized that I originally posted an incorrect method) and while researching the topic, I found your question interesting.

Don'tPanic's recursive method seems to work as far I can tell. I just thought I'd post a non-recursive "stacking" method based on my linked method. I believe there is little performance difference between our solutions, maybe you'll find mine easier to read (maybe not).

Code: (Demo)

function permStacker($array,$length){
    $stack=[[]];  // declare intitial empty element
    for($x=0; $x<$length; ++$x){  // limit the total number of passes / max string length
        foreach($stack as $combination){
            foreach(array_diff($array,range('a',end($combination))) as $left){  // do not include iterate letter that come earlier than rightmost letter
                $merge=array_merge($combination,[$left]);
                $stack[implode($merge)]=$merge;  // keys hold desired strings, values hold subarray of combinations for iterated referencing
            }
        }
    }
    unset($stack[0]); // remove initial empty element
    return array_keys($stack);  // return array of permutations as space delimited strings
}

$permutations=permStacker(range('a','h'),4);
echo 'Total Permutations: ',sizeof($permutations),"\n";
var_export($permutations);

Output (truncated):

Total Permutations: 162
array (
  0 => 'a',
  1 => 'b',
  2 => 'c',
  3 => 'd',
  4 => 'e',
  5 => 'f',
  6 => 'g',
  7 => 'h',
  8 => 'ab',
  9 => 'ac',
  10 => 'ad',
  11 => 'ae',
  12 => 'af',
  13 => 'ag',
  14 => 'ah',
  15 => 'bc',
  16 => 'bd',
  17 => 'be',
  18 => 'bf',
  19 => 'bg',
  20 => 'bh',
  21 => 'cd',
  22 => 'ce',
  23 => 'cf',
  24 => 'cg',
  25 => 'ch',
  26 => 'de',
  ...
  126 => 'afgh',
  127 => 'bcde',
  128 => 'bcdf',
  129 => 'bcdg',
  130 => 'bcdh',
  131 => 'bcef',
  132 => 'bceg',
  133 => 'bceh',
  134 => 'bcfg',
  135 => 'bcfh',
  136 => 'bcgh',
  137 => 'bdef',
  138 => 'bdeg',
  139 => 'bdeh',
  140 => 'bdfg',
  141 => 'bdfh',
  142 => 'bdgh',
  143 => 'befg',
  144 => 'befh',
  145 => 'begh',
  146 => 'bfgh',
  147 => 'cdef',
  148 => 'cdeg',
  149 => 'cdeh',
  150 => 'cdfg',
  151 => 'cdfh',
  152 => 'cdgh',
  153 => 'cefg',
  154 => 'cefh',
  155 => 'cegh',
  156 => 'cfgh',
  157 => 'defg',
  158 => 'defh',
  159 => 'degh',
  160 => 'dfgh',
  161 => 'efgh',
)
mickmackusa
  • 43,625
  • 12
  • 83
  • 136