3

I need to generate all possible permutations of p different characters, at length r with repeated characters in the string. I have an additional constraint- I can only have n consecutive repeated characters at any point within each permutation.

Constraints: (1 ≤ p, r, n ≤ 12)

For example: For the letters "AB", with an r value of 4 and an n value of 2 I'd like the following output:

AABA
AABB
ABAA
ABAB
ABBA
BAAB
BABA
BABB
BBAA
BBAB

The maximum number of adjacent identical characters, n, is 2.

I've achieved the desired output by finding the cartesian product of the string, then iterating through the results and eliminating any that don't fit the n value. My approach is incredibly slow though! Here's my code, if needed:

s = "AB"
q = 2
r = 4

pools = [s] * r
result = [[]]
final = []
for pool in pools:
    result = [x+[y] for x in result for y in pool]

check = [''.join([l]*(q+1))for l in s]

for r in result:
    add = True
    z = ''.join(r)
    for c in check:
        if c in z:
            add = False
            break
    if add:
        final.append(z)

print(final)
# ['AABA', 'AABB', 'ABAA', 'ABAB', 'ABBA', 'BAAB', 'BABA', 'BABB', 'BBAA', 'BBAB']

My solution works for the input above, but for larger numbers/string it takes multiple minutes. For example, the following inputs would take a long time using my solution:

s = "ABCDEF"
q = 3
r = 10

I'd like a solution that would preferably take less than a second. I wrote a much faster recursive solution, but that crashed my machine with memory errors for larger numbers- so no recursive solutions please :) Thanks!

Oli Fog
  • 143
  • 6
  • Why is AAAB valid and not AAAA? Does each letter need to appear at-least once? – Omer Tuchfeld Nov 22 '20 at 16:28
  • @OmerTuchfeld Neither AAAB or AAAA are valid, since they contain 3 consecutive identical characters. And no, each letter does not have to appear at least once! – Oli Fog Nov 22 '20 at 16:29
  • Does this answer your question? [Generate all permutations of a list without adjacent equal elements](https://stackoverflow.com/questions/25285792/generate-all-permutations-of-a-list-without-adjacent-equal-elements) – Leonardus Chen Dec 07 '20 at 19:03

2 Answers2

1

Think of it as a tree traversal problem. The nodes are initial segments of the permutations, each of which can be extended in either p ways or p-1 ways (depending on whether or not the last n characters have been the same).

Some non-optimized code:

from string import ascii_uppercase

def children(s,p,r,n):
    chars = ascii_uppercase[:p]
    if len(s)==r:
        return []
    elif len(s) >= n and s[-n:] == s[-1]*n:
        return [s+c for c in chars if c != s[-1]]
    else:
        return [s+c for c in chars]

def extensions(s,p,r,n):
    kids = children(s,p,r,n)
    if kids == []:
        return [s]
    else:
        perms = []
        for kid in kids:
            perms.extend(extensions(kid,p,r,n))
        return perms

def perms(p,r,n):
    return extensions('',p,r,n)

For example,

>>> perms(2,4,2)
['AABA', 'AABB', 'ABAA', 'ABAB', 'ABBA', 'BAAB', 'BABA', 'BABB', 'BBAA', 'BBAB']

It takes perms(6,10,3) around a minute to generate the 58792500 permutations of ABCDEF of length 10 with no repeated block longer than 3. It is recursive, but the depth of the recursion is bounded by `r, so the recursion shouldn't crash (unless the list itself is too large to hold in memory). You could look into non-recursive list traversal algorithms and write it as a generator rather than something which returns a list.

On Edit The following is still recursive, but it uses generators hence is more memory-efficient:

def yield_leaves(s,chars,r,n):
    if len(s) == r:
        yield s
    else:
        avoid = s[-1] if len(s) >= n and s[-n:] == s[-1]*n else ''
        for c in chars:
            if c != avoid:
                yield from yield_leaves(s+c,chars,r,n)

def yield_perms(chars,r,n):
    yield from yield_leaves('',chars,r,n)

I changed the interface to make it closer to your code. For example,

p = list(yield_perms('ABCDEF',10,3))

is comparable to p = perms(6,10,3). It actually seems to be slightly faster.

John Coleman
  • 51,337
  • 7
  • 54
  • 119
  • Thanks so much for the answer! This is very helpful (and faster than mine), but I'll wait a bit before accepting it to see if anyone comes along with a faster nonrecursive solution. – Oli Fog Nov 22 '20 at 17:21
1

You should not exclude recursion for this kind of problem (although iterative approaches are generally faster). The memory problem is one of implementation. It is not inherent to recursion vs iteration.

Here's an example using recursion to define a generator function:

def permute(letters,size,maxConsec):
    if size == 1: yield from letters; return
    for s in permute(letters,size-1,maxConsec):
        for c in letters:
           if size<=maxConsec or s[-1]!=c or not s.endswith(c*maxConsec):
               yield s+c

for p in permute("AB",4,2): print(p)
    
AABA
AABB
ABAA
ABAB
ABBA
BAAB
BABA
BABB
BBAA
BBAB

It is always going to take a long time for larger strings given the sheer number of permutations to be produced. One of the advantages of using a generator function is that it doesn't need to store all the intermediate results in a list. So you can use it as an iterator to count the values for example, or efficiently make a list at the end:

print( sum(1 for _ in permute("ABCDEF",10,3)) ) # 15.0 seconds

print( len(list(permute("ABCDEF",10,3))) )      # 16.2 seconds

58,792,500 permutations (out of 60,466,176 possible without constraint)

An iterative solution (without recursion) that needs to store intermediate result will add memory considerations but can provide some gain in performance:

def permute(letters,size,maxConsec):
    letterMax = [(c,c*maxConsec) for c in letters]
    result    = [""]
    for sz in range(size):
        canRepeat = sz < maxConsec
        result = [ s+c for s in result for c,m in letterMax
                       if canRepeat or s[-1]!=c or not s.endswith(m) ]
    return result

print(len(permute("ABCDEF",10,3))) # 12.9 seconds

in any case, it's not going to work under a second to generate a list with 58 million values.

list(range(58792500)) # 1.9 seconds

[EDIT] here is a further optimized version of the iterative function as discussed in comments and other improvements

from itertools import product
def permute2(letters,size,maxConsec):
    result   = map("".join,product(*[letters]*maxConsec))
    if size == maxConsec: return list(result)
    xLetters = { x:[c for c in letters if c != x] for x in letters }
    xSuffix  = { c*maxConsec for c in letters }
    for _ in range(size-maxConsec):
        result = [ r+c for r in result
                       for c in (xLetters[r[-1]] if r[-maxConsec:] in xSuffix else letters) ]
    return result

print(len(permute2("ABCDEF",10,3))) # 8.9 seconds
Alain T.
  • 40,517
  • 4
  • 31
  • 51
  • The `size<=maxConsec` and `s[-1]!=c` test reduced the time by 27% on the recursive version and by 15% on the iterative one. Using itertools' product as a starting point didn't make a measurable difference. At this point, I will leave any further optimization to the OP so that the main structure of the code doesn't get hidden by extra complexities – Alain T. Dec 07 '20 at 20:31
  • I didn't expect `product` to be faster than what you had, especially since that part creates only 6^3=216 short strings so that doesn't matter among 58 million longer strings. The point was to get rid of the `size<=maxConsec` check. In my testing, that made the iterative solution about 10% faster. In the meantime I went a bit overboard, I think [this](https://preview.tinyurl.com/y37n55mn) might get you under 10 seconds. It categorizes the partial strings so that the `maxConsec` check is only done for whole categories, not for individual strings. – Kelly Bundy Dec 07 '20 at 21:04