3

I need to generate combination from "a" to "]]]]]]" for example, in order to it I use this python script which works well.

import itertools

DATA_ALPHA_NUM = 
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789&-
()@=+;/!%$\\'\",.<>*^{}#~_[]"
b = 10

for i in range(1,int(b)+1):
   for e in itertools.combinations(DATA_ALPHA_NUM,i): print(''.join(e))

But now I need to do the inverse, for exemple : if I give "1" to the new script it will output "a", if I give 90 it will ouput "]" etc.

I wrote several scripts which worked fine for less than 737191 combinations but they were not good after.

EDIT : someone write something like that then delete it post It was almost perfect ..

alphaNumList = list(DATA_ALPHA_NUM)
alphaNumList.insert(0, "")

result = ["".join(item) for item in 
itertools.islice(itertools.product(alphaNumList, repeat=6), 0,1)]
Switch
  • 31
  • 7
  • if you have the memory, you could store the combinations in a dictionary by index. But that consumes a lot of memory, yes. – Jean-François Fabre Jul 02 '17 at 16:19
  • related: https://stackoverflow.com/questions/14455634/find-the-index-of-a-given-combination-of-natural-numbers-among-those-returned – Jean-François Fabre Jul 02 '17 at 16:20
  • @Jean-FrançoisFabre I knew but I want to to it without loop I need to get the index very quickly .. – Switch Jul 02 '17 at 16:28
  • Someone write something like that "result = ["".join(item) for item in itertools.islice(itertools.product(alphaNumList, repeat=6), 0,1)]" was almost perfect but stop too soon – Switch Jul 03 '17 at 16:58
  • Yes but the idea was here – Switch Jul 03 '17 at 17:38
  • I know the guy. I fwded ypur question to him because he did smth similar – Jean-François Fabre Jul 03 '17 at 18:44
  • Thank you ! Hope he could help me again ! – Switch Jul 03 '17 at 19:29
  • the answer came back, and now it works :) but in your question you're using `combinations`, you have to use `itertools.product(DATA_ALPHA_NUM,repeat=i):` for it to match. – Jean-François Fabre Jul 05 '17 at 08:45
  • La dernière réponse de votre ami ne m'a pas aidée au final. Car le problème survenait vers 8192 ou on revenait à "a" au lieu de aaa mais je l'ai fixé en faisant un set pour supprimer les doublons ! En tout cas merci beaucoup à vous deux ! – Switch Jul 06 '17 at 19:56

2 Answers2

1

Overview

The key to this is to walk through the cumulative combinations until the index is reached.

Solution

from math import factorial

def comb(n, r):
    'Combinations of n things taken r at a time'
    return factorial(n) // (factorial(r) * factorial(n - r))

def nth_combination(population, r, index):
    'Equivalent to list(combinations(population, r))[index]'
    n = len(population)
    c = comb(n, r)
    if index < 0:
        index += c
    if index < 0 or index >= c:
        raise IndexError
    if r < 0 or r > n:
        raise ValueError
    result = []
    while r:
        c, n, r = c*r//n, n-1, r-1
        while index >= c:
            index -= c
            c, n = c*(n-r)//n, n-1
        result.append(population[-1-n])
    return tuple(result)

Optimization

If speed is a concern, it is possible to to build faster version of the comb() function.

One way is to precompute the factorials and then look them up when needed:

fact = [1]
for i in range(1, 1000):
    fact.append(fact[-1] * i)

def comb(n, r):
    return fact[n] // (fact[r] * fact[n - r])

And there is another way that avoids large factorials altogether and doesn't require auxiliary storage:

def comb(n, r):
    c = 1
    r = min(r, n-r)
    for i in range(1, r+1):
        c = c * (n - r + i) // i
    return c

How it works

Start by breaking the combinations into its component groups:

def groups(n, r):
    return [comb(n-i-1, r-1) for i in range(n-r+1)]

>>> comb(8, 3)
56
>>> groups(8, 3)
[21, 15, 10, 6, 3, 1]

This means that when you run itertools.combinations('ABCDEFGH', 3) for n=8 letters taken r=3 at a time, there are 56 combinations. The first 21 start with A, the next 15 start with B, the next 10 start with C, the next 6 start with D, the next 3 start with E, and the last 1 starts with F.

Say you want to find the 25th combination out of the 56. That falls in the second group, so your first letter is B.

And since 25 - 21 is 4, you'll then need to find the 4th combination in the 15 members of "B" group defined by itertools.combinations('CDEFGH', 2). Just repeat the above process until all the letters are extracted.

Testing

Here's a test to make sure it gives the expected results:

from itertools import combinations

population = 'ABCDEFGH'
for r in range(len(population) + 1):
    for i, expected in enumerate(combinations(population, r)):
        actual = locate(list(population), r, i)
        assert expected == actual
Raymond Hettinger
  • 216,523
  • 63
  • 388
  • 485
1

You don't want COMBINATIONS. Indeed, you want "aa". But with combinations, as you never pick TWICE same item, that won't happen.

So here a proper version with "cumulative product", indeed, as Raymond does with combinations, i have to count (90, 90 + 90 **2, 90 + 90 **2 + 90 **3, ...) to find which is the good power corresponding to the combination i'm tracking.

Notice that it's not optimized as i'm slicing over product... just to retrieve one value!

import itertools

alphaNumList = list("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789&-()@=+;/!%$\\'\",.<>*^{}#~_[]")

cumulative = [len(alphaNumList)]
for i in range(1, 10):
    cumulative.append(len(alphaNumList) ** (i+1) + cumulative[i - 1])


def getCombiFromIndex(combiNumber):
    p = 0
    while cumulative[p] < combiNumber:
        p += 1  # WARNING : not robust to combi greater than (10,90) in my case :)
    rest = combiNumber - 1 - (cumulative[p - 1] if p > 0 else 0)
    return "".join([item for item in itertools.islice(itertools.product(alphaNumList, repeat=p + 1), rest, rest + 1)][0])


print(getCombiFromIndex(1))  # "a"
print(getCombiFromIndex(90))  # "]"
print(getCombiFromIndex(91))  # "aa"
print(getCombiFromIndex(800064))  # "ah+1"

UPDATE : I've added the method to retrieve list between 2 indexes, based on same concept but in this case, slice is better used :)

def getCombiListBetween(fromNumber, toNumber):
    combiList = []
    if fromNumber < toNumber:
        pF, pT = 0, 0
        while cumulative[pF] < fromNumber:
            pF += 1
        while cumulative[pT] < toNumber:
            pT += 1
        start = fromNumber - 1 - (cumulative[pF - 1] if pF > 0 else 0)
        end = toNumber - 1
        for p in range(pF, pT + 1):
            combiList += ["".join(item) for item in itertools.islice(itertools.product(alphaNumList, repeat=p + 1), start, min(cumulative[p], end) + 1)]
            start = 0
            end -= cumulative[p]
    else:
        combiList.append(getCombiFromIndex(fromNumber))
    return combiList

print(getCombiListBetween(8189, 8191)) # ['][', ']]', 'aaa']
Jean-François Fabre
  • 137,073
  • 23
  • 153
  • 219
Vince
  • 61
  • 8