9

I'm trying to figure out if I can speed up the generation of permutations. Specifically I'm using 8 out of [a-z] and I'd like to use 8 out of [a-zA-Z] and 8 out of [a-zA-Z0-9]. Which I know will quickly take a great deal of time and space.

Even just a permutation of length 8 from the lowercase ASCII characters takes a while and generates gigabytes. My problem is that I don't understand the underlying algorithm and so I can't begin to figure out if I can split the problem up into smaller tasks than I can join together later.

A python script I was using to generate a list of permutations:

import string
import itertools
from itertools import permutations

comb = itertools.permutations(string.ascii_lowercase, 8)

f = open('8letters.txt', 'w')
for x in comb:
        y = ''.join(x)
        f.write(y + '\n')

f.close()

Does anyone know how to partition this up into subtasks and put them together later? Is it possible at all?

I might just try a (possibly) faster way of doing it, but I'm having trouble with C++ and its std::next_permutation(), so I can't verify that it could speed things up even a little just yet.

If I can partition it up into 16 tasks, and run on 16 Xeon CPUs, then join the results, that would be great.

1 Answers1

10

If it were just permutations with replacement, it would be very easy: Just parallelize on the first letter of the string and then let each thread add the tail of the string. This would give you 26 independent tasks. If that is not enough, you could parallelize on the first two letters.

You want to have a permutation without replacement, so the problem does not trivially factorize. If one only wants to pick 8 letters from a set of 26, 52 and 62, one could do a naive brute thing: Parallelize on the first letter, let the thread just create the tail with replacement and discard the generated strings that contain duplicates. But when you want to pick 25 from 26 letters, this becomes really wasteful.

With that idea in mind, we can do better! We parallelize on the first letter of the string and then generate all permutations with seven elements from the set excluding the letter that we started with. This way we can have 26 tasks (or 52 or 62) and still use the algorithm. This could look like this:

# Use whatever chars you want as the set.
chars = set(string.ascii_lowercase)

# We iterate through all the possible heads. This exact loop will be
# parallelized in the end.
for head in chars:
    # Exclude the head from the set.
    remaining_chars = chars - set(head)

    for tail in itertools.permutations(remaining_chars, 7):
        string = head + ''.join(tail)

        # Store the string in your list/file.

In order to make use of multiple cores, we use a pool. For this we first need a function to map over. This is just the above a bit refactored:

def make_strings(head):
    remaining_chars = chars - set(head)
    strings = [
        head + ''.join(tail)
        for tail in itertools.permutations(remaining_chars, 7)]

    return strings

Now somewhere else we can create a pool and let it map over heads:

with multiprocessing.Pool() as pool:
    all_strings = pool.map(make_strings, chars)

The pool only got the needed __enter__ and __exit__ properties with Python 3, so I assume that we use that.

Once that is finished the flatten the list of lists into a plain list of strings:

strings = [
    string
    for sub_list in all_strings
    for string in sub_list]

Since 26 is an odd number for 16 cores, we could think about creating the heads with itertools.permutation(remaining_chars, 2) and then using the set subtraction to generate the last 6 digits.


This is a complete working script for Python 3 that summarizes everything:

import itertools
import multiprocessing
import string


chars = set(string.ascii_lowercase)


def make_strings(head):
    remaining_chars = chars - set(head)
    strings = [
        head + ''.join(tail)
        for tail in itertools.permutations(remaining_chars, 3)]

    return strings


def main():
    with multiprocessing.Pool() as pool:
        all_strings = pool.map(make_strings, chars)

    strings = [
        string
        for sub_list in all_strings
        for string in sub_list]

    print(strings[:100])


if __name__ == '__main__':
    main()
Martin Ueding
  • 8,245
  • 6
  • 46
  • 92
  • Thank you! So I can use the first part of the code, replace for head in chars: ... with the make_strings function, then add with multiprocessing.ProcessPool ... letit finish then print out the strings in [ string for sub_list in all_strings for string in sub_list ] ? Why is 26 an odd number for 16 cores? I just plucked that number out; I have 24 cores and I can adjust that for the task :) Thank you!!! – James Young Nov 11 '18 at 15:41
  • I have added a complete program at the end such that you do not have to assemble the bits and pieces. This will only print out the first 100 elements, and I have tested it with a length of 4 strings. — Depending on what you do, it might not be the best idea to store them all in a file but rather do what you need to do with each string and not store them at all. – Martin Ueding Nov 11 '18 at 15:45
  • Thank you Martin again. Fantastic. I have been looking up an error though: File "permutations8.py", line 21, in main with multiprocessing.Pool() as pool: AttributeError: __exit__ Apparently it means multiprocessing.Pool() has no enter and exit methods: https://stackoverflow.com/questions/7447284/how-to-troubleshoot-an-attributeerror-exit-in-multiproccesing-in-python – James Young Nov 11 '18 at 16:11
  • Oh. Just switched to Python 3 and it worked. Thanks again Martin, absolutely fantastic. – James Young Nov 11 '18 at 16:43
  • @JamesYoung: I have now explicitly stated that it is Python 3. I had assumed that everything newly developed these days uses Python 3. – Martin Ueding Nov 12 '18 at 12:21
  • I should have made that assumption, yes. Thank you. I had both python 2 and 3 installed, and usually type python3. "Since 26 is an odd number for 16 cores, we could think about creating the heads with itertools.permutation(remaining_chars, 2) and then using the set subtraction to generate the last 6 digits." How would I calculate an optimal number of cores for this task? Does it depend on 26, 8, or both somehow? Cheers. – James Young Nov 12 '18 at 23:11
  • You need to have at least as many work packages as you have cores in your computer, otherwise some cores will be idle. Say we have 16 cores, then splitting by the first letter (26 possibilities), you have 16 cores happily working the letters A to P. These work packages have the exact same size and should finish simultaneously. But then there are only 10 letters left (Q to Z) for 16 cores. 10 cores will work, 6 will be idle. If one splits by first two letters, one will be able to have smaller packages which can be distributed more homogeneously. – Martin Ueding Nov 13 '18 at 10:16
  • Hi Martin, I'm still trying to figure out how I might do this, but not keep all the results in RAM. I would write them to file in make_strings instead of returning the result, is that right? So I could make full use of 10 cores if I were making permutations of string.digits (10 chars), with each work package on 10 possibilities. Then to do the rest, I would have: for tail in itertools.permutations(remaining_chars, 7)] in make_strings. Have I got roughly the right idea now, for permutations of length = 8, using the set of ASCII digits 0 - 9? Thanks again :) – James Young Nov 24 '18 at 03:05
  • Sounds about right. — I do not know what you want to do with all those permutations. Perhaps you can do whatever you want to do with each permutation right after it has been generated and only write things into a file that are useful results of your algorithms. Say you want to crack a password: You can apply your hash function right after generating permutation and only store something to disk when you have found the correct password. No need to generate them all into a file in that case. – Martin Ueding Nov 25 '18 at 19:44
  • Thank you again Martin, you've been very helpful. I'm struggling to understand the docs on Python parallel processing and I've had to Google for real examples that are often more complex than this one. But you are correct, this is close to what I am actually doing! Cheers. – James Young Nov 25 '18 at 21:12