3

I'm attempting to write a program that will generate a text file with every possible permutation of the alphabet from one character up to twenty-nine characters. I've chosen 29 as the longest English word that everyone knows is antidisestablishmentarianism which is 28 characters in length. There are longer ones, but they are mainly very technical and obscure.

I realise this will generate a huge number of strings. However I've no idea where to start or even how to figure out how many combinations this will generate.

Answers please for solutions in PHP, Processing, C++ or Java (I'm only familiar with those, PHP is preferred, but probably no the best for this I should imagine).

Or even just pseudo-code / ideas will be appreciated.

Also, before someone says it, this isn't for brute forcing or anything like that. I'm an artist, albeit somewhat unknown and obscure with my concepts.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
logic-unit
  • 4,195
  • 12
  • 47
  • 72
  • 7
    I hope you understand just how big this text file would be. Better purchase a few more hard disks. – EboMike Jan 05 '11 at 23:05
  • If you just do every possible permutation of 29-character words, that would be 26 to the power of 29. That's about 1e41. – EboMike Jan 05 '11 at 23:06
  • Yeah that's on my to do list. Sorry, excuse my bad maths, what's that as a written number? – logic-unit Jan 05 '11 at 23:06
  • 1
    The combined storage space of our entire planet could not store that list. – thirtydot Jan 05 '11 at 23:07
  • 3
    Oh. One for the future then I guess. – logic-unit Jan 05 '11 at 23:09
  • Okay, maybe I wasn't clear. Do you have a 1TB hard drive? 1e41 bytes is 100000000000000000000000000000TB. Actually, I was wrong - 1e41 is just the number of ''combinations''. You'd have to multiply that by 30. Bah, what's another zero? – EboMike Jan 05 '11 at 23:10
  • Ok, storage space aside, how would the program work. For say up to 10 characters? – logic-unit Jan 05 '11 at 23:10
  • That's a lot of terabytes. Probably why no one has done it then I guess. – logic-unit Jan 05 '11 at 23:11
  • Maybe I read your question wrong - what is it you are trying to do? Shuffle the letters around, as done in my answer? Or perform a monoalphabetic substitution on your word for every possible substitution as in http://en.wikipedia.org/wiki/Substitution_cipher#Simple_substitution? – etarion Jan 05 '11 at 23:13
  • 1
    Are we talking about *permutations* of a single word or *combinations* of the letters of the alphabet? From the question it isn't clear to me. – Matteo Italia Jan 05 '11 at 23:16
  • Hi etarion, thanks for your answer. I'm trying to create every unique combination of A-Z from one letter, up to twenty-nine letters. Basically, every word that could, or does exist, up to twenty letters. – logic-unit Jan 05 '11 at 23:18
  • As art, I do not personally see the point of having a list of random characters so large that no one can ever see it all. – brian_d Jan 05 '11 at 23:19
  • @Matteo Italia i think it might be combinations actually. – logic-unit Jan 05 '11 at 23:20
  • @brian_d well it's a subjective field. i'm not really debating the concept, just asking how it can be done. – logic-unit Jan 05 '11 at 23:21
  • 1
    @logic-unit: well, so they are *combinations*; they should be 26+26^2+26^3+...+26^29=26^30-1 combinations (IIRC). – Matteo Italia Jan 05 '11 at 23:21
  • @logic-unit: have a look at this answer of mine: http://stackoverflow.com/questions/1991361/find-all-combinations-of-a-given-set-of-numbers/1991542#1991542. Keep in mind that obviously that code would happily run for hundreds of years before getting your work done. – Matteo Italia Jan 05 '11 at 23:24
  • 1
    Ok, given that on my machine getting 26^6 combinations (discarding the IO) is done in 9.48 s, doing all the 26^30 (-1) should take 26^30/26^6*9.48=26^24*9.48=86331381095212797790462237062180372,48 s, which are about 2,7*10^27 years. I hope you have a good UPS to keep your PC running when the Sun will be completely defunct. – Matteo Italia Jan 05 '11 at 23:31
  • @Matteo Italia thanks, that's looks really interesting. Not sure if I've got hundreds of years though. Might have to be a family heirloom. – logic-unit Jan 05 '11 at 23:31
  • 2
    Since this is an art project, maybe the result doesn't need to be stored. Perhaps it could flash an LED pointed at the Large Magellanic Cloud or something? – Tony Park Jan 05 '11 at 23:32
  • @Tony Park thanks yeah, i was just trying to think of alternatives, since my idea for a small leather bound book has been blown out of the water. – logic-unit Jan 05 '11 at 23:34
  • Actually at some time I thought about making a web site that provided all the possible BW images of a certain size (generating them with the timestamp of the request), maybe letting users save their favorite ones. I think that such a thing would be more fun, since a nonsense word tells nothing, while in a strange image you can always find some strange meaning. – Matteo Italia Jan 05 '11 at 23:34
  • @Matteo Italia I'd not thought of using images. Although a would word without meaning has infinite potential. – logic-unit Jan 05 '11 at 23:36
  • @logic-unit: IMO it would be nicer and more evocative. By the way, are you trying to achieve something like this? http://en.wikipedia.org/wiki/The_Nine_Billion_Names_of_God :D – Matteo Italia Jan 05 '11 at 23:36
  • @Matteo Italia thanks for the link i'll have to read that. I guess, in a way yes, though I don't have any theistic inclinations. Yes I agree about the images, probably worth exploring both. In fact there's probably a lot of variations on this idea. – logic-unit Jan 05 '11 at 23:40
  • @logic-unit; what if you print the first million strings in your book, then an ascii art image of an explosion, followed by the algorithm? :-) – Tony Park Jan 05 '11 at 23:41
  • @logic-unit; from some perspectives, there is no difference between the output and the algorithm, that might be a fun idea for a programmer / artist to explore - and did I just say 'perspective'? Now I have an image of a large string receding into the distance – Tony Park Jan 05 '11 at 23:43
  • @Tony Park heh, nice idea. Or I could just print a 100 books of random combinations and hope no one checks. – logic-unit Jan 05 '11 at 23:43
  • Is there a way to generate just one combination in the sequence on request? So using the algorithm a person could browse any of all the combinations, without them all having to be generated and written to a file? – logic-unit Jan 05 '11 at 23:46
  • @logic-unit; but wouldn't it be cool if someone did check? – Tony Park Jan 05 '11 at 23:47
  • @logic-unit: an alternative would be to get a monkey with a typewriter and give it infinite time. Tell us when you got the complete works of Shakespeare. :) – Matteo Italia Jan 05 '11 at 23:47
  • @logic-unit: well, a combination of letters is just a fancy way of writing a number (you're writing it in base 26 instead of base 10); you can use the ordinary algorithms of base change to do that. But if you asked to the user the number of the combination to give him back the equivalent string, it would be a fraud, you would just be changing the base in which his number is expressed. :) – Matteo Italia Jan 05 '11 at 23:49
  • One combination? Easy. Basically take your top answer, and ignore the loop. But not in a book :-) – Tony Park Jan 05 '11 at 23:49
  • Oh yeah! Though the idea of a book seems better, a little more profound perhaps. This kind of problem will probably be easy in 20 years... – logic-unit Jan 05 '11 at 23:54
  • Thank you everyone for your input, I'll need to think about this one for a while. – logic-unit Jan 05 '11 at 23:55
  • @logic-unit just an afterthought. are you interested in *every single permutation* or every "word" that *could* be something meaningful now or in the future? for example, do your permutations need at least a vowel every x consonants, or could they just be some kind of acronym, nonsense word or encrypted code? – brian_d Jan 06 '11 at 14:42
  • @brian_d thanks for your reply - afterthoughts are always welcome. i hadn't gone that far into it yet. i think i'm looking to create combinations of letters that are or could be a word in english. i'm unsure of the pattern, or structure, that all english words follow - something i will have to look into. i should imagine this will bring the 26 to the power of 29 figure down somewhat? – logic-unit Jan 06 '11 at 21:26
  • @logic-unit I think it would certainly reduce the size of the result set, though the generator would still be large – brian_d Jan 06 '11 at 22:30

13 Answers13

4

The word "permutation" usually means that each letter appears exactly once, so it would be impossible to generate any permutation with more than 26 letters. Anyway, since the number of generated strings is too big, you can use random strings instead (the following is C code):

char s[30];
int p;
for (;;) // repeat forever: you cannot use a realistic iteration limit anyway
{
    for (p = 0; p < 29; ++p)
        s[p] = 'a' + rand() % 26;
    s[29] = '\0';
    puts(s);
}
anatolyg
  • 26,506
  • 9
  • 60
  • 134
  • Not very systematic, is it :) – EboMike Jan 05 '11 at 23:19
  • 2
    that is incorrect, you have given the definition for combination, a permutation is a construction using a combination, eg: the combination of letters a,b and c can construct the permuations abc or aabbcc or abbbbc etc... –  Jan 05 '11 at 23:51
  • Indeed, you are giving a definition of a combination, not permutation. In combination, abc and acb are the same. But in permutation, abc and acb are 2 different arrangement. – Lynnell Neri Dec 09 '15 at 04:51
3
void out_perms(std::string word) {
    std::vector<int> indexes(word.size());
    for (size_t i = 0; i < indexes.size(); ++i)
        indexes[i] = i;
    do {
        for (size_t i = 0; i < indexes.size(); ++i)
            std::cout << word[indexes[i]];
        std::cout << std::endl;
    } while (std::next_permutation(indexes.begin(), indexes.end()));
}

int main(int, char**) {
    out_perms("asdfg");
}

See http://codepad.org/6lQTPQrG for example output

etarion
  • 16,935
  • 4
  • 43
  • 66
  • I interpreted "every possible permutation of the alphabet" as "every possible combination of letters in the alphabet", i.e. from "a" over "aaaaa" through "zzzzz". – EboMike Jan 05 '11 at 23:17
3

Obviously, the outer for loop is for the number of characters in your word. Then, you just create strings with that length. For length 5, you start with "AAAAA", then "AAAAB", "AAAAC".

Once you hit "Z", you go back and move the character to your left one up, i.e. "AAAAZ" becomes "AAABA", and "AAAZZ" becomes "AABAA". Once you hit "ZZZZZ", you're done with the inner loop, and the outer loop will then start with "AAAAAA".

EboMike
  • 76,846
  • 14
  • 164
  • 167
  • 1
    I believe the concept is more easily understood as counting in base 26. The first "digit" goes from A to Z, next to AA, AB, ..., AZ then BA and so on. – Thomas Matthews Jan 05 '11 at 23:20
  • Yep, same thing as counting from 1 to 999999, except that you're using letters as digits. – EboMike Jan 05 '11 at 23:23
3

Here is a simple untested program in C++ that creates the words by counting in Base 26:

#include <string>
#include <iostream>

int main(void)
{
    //----------------------------------------------------------
    //  Print permuations of strings of letters up to length 5.
    //  Use base 26 arithmetic.
    //----------------------------------------------------------
    const unsigned int MAX_ITERATIONS = 26 * 26 * 26 * 26 * 26;

    std::string word = "A";
    for (unsigned int i = 0; i < MAX_ITERATIONS; ++i)
    {
        //------------------------------------------------------
        //  Print the word
        //------------------------------------------------------
        std::cout << word << std::endl;

        //------------------------------------------------------
        //  Increment the word, using base 26 arithmetic.
        //  A, B, C, ..., Z.
        //  AA, BA, CA, ..., ZA, AB, BB, CB, DB, ..., ZZ.
        //  AAA, BAA, CAA, ..., ZAA, ABA, BBA, CBA, DBA, ..., ZZZ.
        //------------------------------------------------------
        bool            carry_generated = false;
        unsigned int    digit = 0;
        do
        {
            carry_generated = false;
            if (word[digit] < 'Z')
            {
                ++word[digit];
                break;
            }
            word[digit++] = 'A';
            if (word.length() == digit)
            {
                word += "A";
                break;
            }
            carry_generated = true;
        } while (carry_generated && (digit < 5));
    }

    return 0;
}

The number of words printed can be reduced by checking a word list (a.k.a. dictionary) before printing. If the word is in the word list, print it.

The biggest issue with a word length of 29 is representing the quantity. The quantity overflows the range of the standard C++ unsigned integers. A Big Int library would need to be used. The next issue is the time required to process every combination. Multiply the quantity by 1 microsecond per iteration (a kind of worse case) and reduce down to days, hours, minutes and seconds. Perhaps years may be required.

Thomas Matthews
  • 56,849
  • 17
  • 98
  • 154
  • Thanks Thomas, that's a great answer. Yeah I hadn't thought of it exceeding the integer length. – logic-unit Jan 06 '11 at 00:00
  • 1
    My rough estimate is 8.9E+29 years, assuming 1 microsecond per iteration and a total of 26^30 (26 to power of 30) iterations (it should be less by one, but that is insignificant in a number this large). I could be wrong... – Thomas Matthews Jan 06 '11 at 00:05
2

Using PHP's Perl-style character incrementing.

set_time_limit(0);

$perm = 'A';
$endTest = str_repeat('Z',28).'A';
while ($perm != $endTest) {
    echo $perm++,"\n";
}

Run the script from the command line so you don't hit a webserver timeout; then sit back and wait a few years for it to complete

Mark Baker
  • 209,507
  • 32
  • 346
  • 385
2
function p($length, $partial)
{
      if ($length == 0) return $partial;
      $ans = array();
      foreach (range('a', 'z') as $i)
      {
          $ans[] = p($length -1, $partial . $i);
      }
      return $ans;  
}

$top = 3;
//$f = fopen('out.txt');
for ($l = 1; $l < $top+1; $l++)
{
     print_r(p($l), '');
     //fwrite($p($l), '');
}

If you want to set $top to 29 and give it a try go ahead. I'm not going to.

EDIT - print_r(p($l), ''); ---> print_r(p($l, ''));

PHP keeps impressing me with its tolerance for mistakes. Missing a 'required' argument to my p? no problem itll just be empty string somehow (or zero, or false, situation depending). Second '' argument to print_r? no difference, gets treated like the default false anyway

EDIT

I don't know what the hell I was doing here. The different return types of p are quite odd, and will return a compound array with a weird structure.

This is a far better solution anyway

$lengthDesired = 29;
for($i='a'; $i != str_pad('',$lengthDesired+1,'a'); $i++)
    echo $i .', ';
jon_darkstar
  • 16,398
  • 7
  • 29
  • 37
1
public class hii {  

public static void main(String[] args){

    String[] database = {"a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"};

    for(int i=1; i<=database.length; i++){
        String[] result = getAllLists(database, i);
        for(int j=0; j<result.length; j++){
            System.out.println(result[j]);
        }
    }



}


    public static String[] getAllLists(String[] elements, int lengthOfList)
    {
        //initialize our returned list with the number of elements calculated above
        String[] allLists = new String[(int)Math.pow(elements.length, lengthOfList)];

        //lists of length 1 are just the original elements
        if(lengthOfList == 1) return elements; 
        else {
            //the recursion--get all lists of length 3, length 2, all the way up to 1
            String[] allSublists = getAllLists(elements, lengthOfList - 1);

            //append the sublists to each element
            int arrayIndex = 0;

            for(int i = 0; i < elements.length; i++){
                for(int j = 0; j < allSublists.length; j++){
                    //add the newly appended combination to the list
                    allLists[arrayIndex] = elements[i] + allSublists[j];
                    arrayIndex++;
                }
            }
            return allLists;
        }
    }






}
Uri Agassi
  • 36,848
  • 14
  • 76
  • 93
hacker
  • 342
  • 1
  • 4
  • 14
1

Here is a permutation generator written in java http://www.merriampark.com/perm.htm.

However as he mentions

  //-----------------------------------------------------------
  // Constructor. WARNING: Don't make n too large.
  // Recall that the number of permutations is n!
  // which can be very large, even when n is as small as 20 --
  // 20! = 2,432,902,008,176,640,000 and
  // 21! is too big to fit into a Java long, which is
  // why we use BigInteger instead.
  //----------------------------------------------------------

Since your n is 29, you will be waiting a long, long time. It is too large, as EboMike is trying to tell you in his comments.

brian_d
  • 11,190
  • 5
  • 47
  • 72
  • Yes I knew it was going to be big. Didn't quite realise how big. I'm sure it's possible, to display this information in some form. That may be the bigger challenge, rather than generating the combinations. – logic-unit Jan 06 '11 at 00:04
1

just off the top of my head (PHP).

$index = 0;

while(1) {
   $output_string = '';
   $base_26 = (string)base_convert($index, 10, 26);
   if (strlen($base_26) > 29) break;
   for ($i = 0; $i < strlen($base_26); $i++) {
      $output_string .= chr(65 + base_convert($base_26[$i], 26, 10));
   }
   $index++;
   echo $output_string;
}
dqhendricks
  • 19,030
  • 11
  • 50
  • 83
1

This is what I would do:

#include <iostream>

void printWords(std::string& word, int index, int last)
{
    std::cout << word << "\n";
    if (index != last)
    {
        for(char loop = 'a'; loop <= 'z'; ++loop)
        {
            word[index] = loop;
            printWords(word, index+1, last);
        }
        word[index] = ' ';
    }
}

int main()
{
    std::string word("                             "); // 29 space

    printWords(word,0,word.length());
}
Martin York
  • 257,169
  • 86
  • 333
  • 562
1

A Java solution that should do the trick:

public void characterPermutations(int length, LinkedList<String> permutations) {
    if(length > 1) {
        characterPermutations(length - 1, permutations);

        ListIterator<String> iterator = permutations.listIterator();
        while(iterator.hasNext()) {
            String permutation = iterator.next();
            for(char c = 'a'; c <= 'z'; c++) {
                iterator.add(c + permutation);
            }
        }

    } else {
        for(char c = 'a'; c <= 'z'; c++) {
            permutations.add(c + "");
        }
    }
}
0

The easiest way I can think of to get every permutation from 1 char to 29 chars in pseudo-code:

loop from i = 1 to 26^29 or 27^29 if you want to include spaces
{
   convert i to base 26 or 27;
   translate each number to the corresponding letter;
}
182764125216
  • 931
  • 14
  • 38
0

Even if you could store this on disk, like thirydot pointed out, you'd run out of time doing this.

Just generating (and not storing) all the 6-letter possiblilies took 24 seconds on my computer:

$ time perl letters.pl

real    0m24.837s
user    0m24.765s
sys     0m0.030s

Which is 7.7X10^-8s per word, That means it would take 8.4x10^33s or 2.6x10^26 years to do this.

You need to think more about your algorithm.

miked
  • 3,458
  • 1
  • 22
  • 25