Unbiased Random List Generation

Question

Problem Statement

I need to simulate the process of 100 Prisoner Problem, in which each drawer of a cupboard of size 100 contains a unique number from 0 to 99. I initialze the number of drawers as follows:

def arrange_event(n_prisoners=100):
    if (n_prisoners % 2) != 0 or n_prisoners <=0:
        raise ValueError("Number of prisoners must be a postive and even number.")
    drawer = list(range(n_prisoners))
    random.shuffle(drawer)
    return drawer

But I doubt that this implementation (e.g.shuffling from a specific list) will introduce bias to some specific patterns during sampling. A correct implemenation of list initialzation is needed. Any explanation of the bias (if exists) introduced in this implementation will be helpful as well.

Note

I'm not simply asking how to make each initiazation random, the key is that the probabilty of each permutation should be equal in the expcted implemetation, which is completely different from this question.

Here's what I found in the official document, which suggests that the probability distrbution over permutations are biased in my implemetation using random.shuffle().

random.shuffle(x[, random])

Shuffle the sequence x in place. The optional argument random is a 0-argument function returning a random float in [0.0, 1.0); by default, this is the function random().

Note that for even rather small len(x), the total number of permutations of x is larger than the period of most random number generators; this implies that most permutations of a long sequence can never be generated.

Looks like you are trying to do this: https://stackoverflow.com/questions/976882/shuffling-a-list-of-objects I don't think python random would be secure for cryptographic applications, but will probably work fine for this. Do you have a specific problem with the output it is giving? — MikeS159, May 15 '18 at 13:39
My problem is not only making each initialization random, the key is to make the distribution over each permutation a uniform one. — Boris Polonsky, May 15 '18 at 19:08
Then I guess use the system random number generator, which is constantly seeded with more data and thus doesn't really have a "period". See [`random.SystemRandom()`](https://docs.python.org/3/library/random.html#random.SystemRandom). — glibdud, May 15 '18 at 19:19
This function may be useful, but I don't see how this will make shuffling from a fixed sequence produce uniform distribution over permutations of this sequence. Maybe shuffling is not the correct approach, but I didn't know what is. — Boris Polonsky, May 15 '18 at 19:27
Then I guess I don't understand the problem, because `SystemRandom` should generate a uniform distribution, or at least the closest you'll get without using some sort of hardware RNG. — glibdud, May 15 '18 at 19:39
@BorisPolonsky, what do you mean by uniform distribution? Do you want the shuffle to be the same each time? If you mean it in the strict mathematics sense, Rython random does this. — MikeS159, May 15 '18 at 20:24
@MikeS159 By saying uniform distribution I mean each permutation of the sequence (not a single number) should be of identical probability of being generated. I've already figured out a naive approach for doing this. — Boris Polonsky, May 16 '18 at 05:17
@glibdud I figured out a solution for making probability of each permutation identical by sampling without replacement(see the proposal of my answer). I used `random.choice` instead but still I don't know if this solution is equivalent to using `SystemRandom`. — Boris Polonsky, May 16 '18 at 05:24

score 0 · Answer 1 · answered May 16 '18 at 05:14

0

I use a naive approach to make probability of each permutation equal, by sampling without replacement while keeping the order of number sampled. The implementation is as follows:

def arrange_event(n_prisoners=100):
    if (n_prisoners % 2) != 0 or n_prisoners <=0:
        raise ValueError("Number of prisoners must be a postive and even number.")
    drawer = list(range(n_prisoners))
    for i in range(n_prisoners, 1, -1):
        random_index = random.choice(range(i))
        swap_a, swap_b = drawer[random_index], drawer[i-1]
        drawer[random_index], drawer[i-1] = swap_b, swap_a
    return drawer

answered May 16 '18 at 05:14

Boris Polonsky

115
11

Out of curiosity, why do you think this is more uniform than using `random.shuffle()`? – glibdud May 16 '18 at 11:16
1

In fact, looking closer, this is functionally identical to how Python's [`random.shuffle()`](https://github.com/python/cpython/blob/9285835a05ad0319acef111340266c0f85ed549c/Lib/random.py#L286) works. I'm not sure what advantage you think re-implementing it in your code conveys. – glibdud May 16 '18 at 12:12
@glibdud So I suppose you suggestion is that I should specify parameter `random` in `random.shuffle()` with the corresponding method in class `random.SystemRandom` to address the "period" issue, correct? – Boris Polonsky May 17 '18 at 07:49
Or use SystemRandom directly: `random.SystemRandom().shuffle(drawer)`. Of course, this only matters if you plan on processing at least `(2**19937-1)/(n-1)` iterations of the problem, as that's how long it would take to get to a repeated state of the standard PRNG. – glibdud May 17 '18 at 13:08
@glibdid Your solution is truely helpful for me. As for your curiosity, I posted this question because I remembered reading some post saying that some company use shuffle for such purpose (not written in Python), and the simulation results tuned out to be biased. I can't find that post again, that's why I end up posting this question here to ensure that my implementation in Python is unbiased. – Boris Polonsky May 17 '18 at 14:04

Unbiased Random List Generation

Problem Statement

Note

1 Answers1