4

I'm looking for a way to randomly sample a fixed length subset of all permutations.

import itertools
from random import shuffle

mylist = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T']

APPROACH A

Approach A below suffers from the problem that the permutations are too similar.

a_pre = itertools.permutations(mylist,20)
a = itertools.islice(a_pre,3)

list(a)

['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T']

['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'T', 'S']

['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'S', 'R', 'T']

APPROACH B

Approach B gets me closer to my desired outcome, but here there's always a risk of producing identical ordering between lists, so this approach is not feasible.

#repeat n=3 times

shuffle(mylist)
print(mylist)

['J', 'B', 'M', 'A', 'O', 'C', 'K', 'S', 'H', 'Q', 'N', 'T', 'R', 'D', 'G', 'P', 'I', 'E', 'F', 'L']

['R', 'O', 'C', 'I', 'G', 'E', 'Q', 'L', 'P', 'J', 'F', 'N', 'A', 'B', 'H', 'T', 'D', 'K', 'M', 'S']

['L', 'O', 'I', 'G', 'B', 'E', 'R', 'A', 'D', 'N', 'J', 'S', 'H', 'F', 'K', 'M', 'Q', 'T', 'C', 'P']

Farhan.K
  • 3,425
  • 2
  • 15
  • 26
themachinist
  • 1,413
  • 2
  • 17
  • 22
  • 1
    How many permutations do you want to generate? You can store the ones you already had in a set (e.g. after concatenating the list to a string, or after turning it into a tuple) and avoid the ones you have already used. – Sven Marnach Feb 12 '18 at 11:38
  • For the given list, there are 20! = 2432902008176640000 different permutations, so collisions are really very unlikely. – tobias_k Feb 12 '18 at 11:45
  • 1
    Possible duplicate of [Python random sample with a generator / iterable / iterator](https://stackoverflow.com/questions/12581437/python-random-sample-with-a-generator-iterable-iterator) – Piotr Dobrogost Feb 12 '18 at 11:45

4 Answers4

4

but here there's always a risk of producing identical ordering between lists, so this approach is not feasible.

You can use tuples (since lists aren't hashable) and sets (so that there are no duplicates/identical lists) to get around this:

from random import shuffle

mylist = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T']
myset = set()
while len(myset) < 5: #change 5 to however many you want
     shuffle(mylist)
     myset.add(tuple(mylist))
print([list(x) for x in myset])

Edit: As @tobias_k points out:

For the given list, there are 20! = 2432902008176640000 different permutations, so collisions are really very unlikely.

Sean Breckenridge
  • 1,932
  • 16
  • 26
  • 2
    You might add a bit about the chance of encountering duplicates (very, very low for the given list). – tobias_k Feb 12 '18 at 11:50
3

Consider the itertools random_permutation recipe:

From the docs:

def random_permutation(iterable, r=None):
    "Random selection from itertools.permutations(iterable, r)"
    pool = tuple(iterable)
    r = len(pool) if r is None else r
    return tuple(random.sample(pool, r))

Code

import string

import more_itertools as mit


iterable = string.ascii_uppercase[:-6]
[random_permutation(iterable) for _ in range(3)]

Output

[('M', 'K', 'Q', 'A', 'I', 'J', 'H', 'T', 'C', 'E', 'P', 'L', 'B', 'N', 'G', 'F', 'S', 'D', 'O', 'R'), 
 ('A', 'G', 'I', 'S', 'E', 'T', 'B', 'Q', 'D', 'M', 'C', 'O', 'J', 'H', 'N', 'F', 'K', 'P', 'R', 'L'), 
 ('C', 'S', 'O', 'H', 'I', 'K', 'A', 'G', 'D', 'B', 'R', 'E', 'L', 'T', 'M', 'N', 'F', 'P', 'Q', 'J')]

more_itertools is a third-party library that implements this recipe for you.

pylang
  • 40,867
  • 14
  • 129
  • 121
1

you could use this to generate the number-th lexicographic perutation of N elements:

def permutation_from_int(N, number):
    '''
    get the number-th lexicographic permutation of length N.

    N: the length of the permutation
    0 <= number <= factorial(N) -1: the number of the desired
    permutation
    '''

    # assert 0 <= number < factorial(N)

    ret = [None] * N
    select = list(range(N))

    for i in range(N - 1, -1, -1):
        index, number = divmod(number, factorial(i))
        element = select[index]
        ret[N - 1 - i] = element
        select.remove(element)
    return ret

then you just have to generate (and keep a set of - if you want to avoid duplicates) random integers that represent the permutations:

N_TESTS = 17
strg = 'ABCD'
N = len(strg)
N_MAX = factorial(N)
seen = set()

for _ in range(N_TESTS):
    number = randrange(N_MAX)
    while number in seen:
        number = randrange(N_MAX)
    seen.add(number)
    perm = permutation_from_int(N, number)
    print(''.join(strg[i] for i in perm))

note that this may loop forever if the number of test is bigger that the space of all the permutations...

which prints (e.g.):

DACB
DBCA
BADC
BDCA
DCAB
DABC
CADB
DBAC
DCBA
...

but as mentioned in the other answers: if you have a permutation of 20 elements the chance of hitting a repeated permutation is very small!

hiro protagonist
  • 44,693
  • 14
  • 86
  • 111
0

I think your question is a special case of one I had, for k=N Based on this, the two solutions stated there should apply. The first one is a tad slow :)

So the random sampling one (Which you also hint at your question, just discard duplicates...) seems to be the only answer for now.

It would be very interesting to see if there is a generative solution to either this question, or the more general one... Here's the code based on that answer:

import itertools
from random import shuffle

mylist = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T']
n=len(mylist)
k = n
m = 5
samples = set()
tries = 0
while len(samples) < m:
    samples.add(tuple(random.sample(mylist,k)))
    print (len(samples))

print(samples)
print(tries)
ntg
  • 12,950
  • 7
  • 74
  • 95