4

I am trying to make an integer partition generator in which order matters and all parts are less than k (e.g. if k = 10 and n = 100, [5, 95] wouldn't be valid because 95 >= 10). I have already copied a generator (and modified it) where order does not matter from this website and am now trying to make a generator that goes through the permutations of each partition.

My main problem is creating a function that accepts a list (partition) and outputs the permutations of that list efficiently. I have already implemented an inefficient version of this by using itertools permutation, which I included:

set(itertools.permutations(partition))

This is very inefficient. When partition has something with duplicates, it iterates through all the duplicates. For example when partition = [1, 1, 1, 1, 1, 1, 1], itertools would loop through all 7! = 5040 partitions when there is only one unique permutation.

How can I make this more efficient? I will be considering values of n on the order of 10^10 and above, so efficiency is very important.

If it is helpful, note that all parts of a partition are less than k.

Edit: I think I (mostly) got it. There is a bug, though.

def helper(current, index, available, freqList):
    if index >= len(freqList) or available == []:
        yield current
        return
    else:
        if freqList[index] != 0:
            for combo in itertools.combinations(available, freqList[index]):
                yield helper(new(current, index, combo), index + 1, delete(available, combo), freqList)
        else:
            yield helper(current, index + 1, available, freqList)
def permutations(partition, k):
    #returns permutations of partition

    length = len(partition)
    freqList = [partition.count(i) for i in range(k)]

    available = [i for i in range(length)]
    return helper([0]*length, 1, available, freqList)

delete(L, T) returns a copy of L without the elements in T. new(L, i, T) returns a copy of L but with all indices in T replaced with i.

What this does is basically iterate through the combinations of where each number can go. Let's say there are 10 1s, 5 2s, and 7 3s in the partition. It iterates through the possible spaces the 1s can go. It then iterates through the spaces the 2s can go given that the 1s occupy other spaces. Finally, it iterates through the spaces the 3s can go given the 1s and 2s occupy their spaces.

There is an error in this code. For some reason permutations returns a generator of generators of generators... of lists nested m times, where m is the max elements in partition. For example if partition = [1, 1, 1, 3, 3, 3], I would have to do

for gen1 in permutations(partition, k):
    for gen2 in gen1:
        for gen3 in gen2:
            for partition in gen3:
                #something else

in order to access the partitions.

I don't know how to fix this, but think it is a problem in my yields.

Edit 2: Davis Herring mentioned in the comments that I could simply do yield from (helper(...) to fix the problem. I didn't know that I could do that and it worked. I am now interested in other solutions or optimizations for this one.

Varun Vejalla
  • 183
  • 1
  • 8
  • Why the downvote? I thought I explained my problem and my attempt well. – Varun Vejalla Oct 20 '19 at 04:30
  • Did you mean `yield from helper(…)`? – Davis Herring Oct 20 '19 at 06:36
  • Huh that worked - I didn't know that I could do that. Thanks for the help! – Varun Vejalla Oct 20 '19 at 07:02
  • I have not looked much at your implem but since you are copying stuff, maybe memory manipulation will slow down your algo. You may have a look at [multiset permutations](https://stackoverflow.com/questions/19676109/how-to-generate-all-the-permutations-of-a-multiset) in which, courtesy smichr has already provided an implem (which I have not checked either) – grodzi Oct 20 '19 at 08:23
  • Do you want a cap on the number of parts? Otherwise you'll have a list of 10 billion ones, then 10 billion lists with nearly all ones and a single 2, then about 50 quintillion lists with two 2s among all the 1s, then about 167 octillion lists with three 2s... Even processing these at the rate of a billion per second (a gigahertz), it will take 5 trillion years just to get through the lists with 3 twos and otherwise ones. – Nick Matteo Oct 24 '19 at 16:10

0 Answers0