
Assuming I can generate random bytes of data, how can I use that to choose an element out of an array of n elements?

If I have 256 elements I can generate 1 byte of entropy (8 bits), and then use that to pick my element simply by converting it to an integer.

If I have 2 elements I can generate 1 byte, discard 7 bits and use the remaining bit to select my element.

But what if I have 3 elements? 1 bit is too few and 2 is too many. How would I randomly select 1 of the 3 elements with equal probability?
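
For concreteness, here is what those first two cases look like in Python (secrets.token_bytes standing in for my source of random bytes):

```python
import secrets

# 256 elements: one random byte, converted to an integer, is already
# a uniform index into the array.
elements = list(range(256))
index = secrets.token_bytes(1)[0]      # uniform integer in 0..255
picked = elements[index]

# 2 elements: generate a byte, keep one bit, and discard the other seven.
coin = ["heads", "tails"]
bit = secrets.token_bytes(1)[0] & 1    # uniform bit: 0 or 1
picked2 = coin[bit]
```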

mpen
  • You use 2 bits. In case of 11 you repeat. – user515430 Oct 28 '14 at 16:12
  • @user515430 Is that the best way? If I have 1025 elements I have to use 11 bits and I have nearly a 50% chance of having to re-pick. – mpen Oct 28 '14 at 16:15
  • Which means on average you only have to repeat a single time. But yes, you can do better. It is just more involved. Suppose you want to select one of five. Instead of using 3 bits, you could use 4 bits and only have to repeat one in sixteen tries. – user515430 Oct 28 '14 at 18:38
  • @user515430 Wouldn't using more bits *increase* the odds of a miss? – mpen Oct 28 '14 at 19:19
  • No. With four bits, 0000, 0001, 0010 mean the first choice, 0011, 0100, 0101 mean the second choice, etc. Only 1111 requires a repeat. – user515430 Oct 28 '14 at 21:21
  • Good exposition here: https://stackoverflow.com/questions/11758809/what-is-the-optimal-algorithm-for-generating-an-unbiased-random-integer-within-a – danh Feb 14 '23 at 23:49
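
A minimal Python sketch of the rejection approach described in the comments above, including the trick of drawing extra bits so that repeats become rare (secrets.randbits is just an assumed stand-in for the random bit source):

```python
import secrets

def rand_index(n, k):
    """Uniformly pick an index in range(n) from k random bits per draw.

    Requires 2**k >= n. Using a larger k lowers the chance that a
    draw is rejected and must be repeated.
    """
    group = (1 << k) // n          # how many k-bit values map to each index
    limit = group * n              # values below this are accepted
    while True:
        value = secrets.randbits(k)
        if value < limit:
            return value // group  # e.g. n=5, k=4: 0-2 -> 0, 3-5 -> 1, ...
        # otherwise reject and draw k fresh bits
```

For example, with n = 1025 a draw of k = 11 bits is rejected with probability 1023/2048 (about 50%), whereas k = 13 bits brings that down to 1017/8192 (about 12%), at the cost of two extra bits per draw.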

2 Answers


Here is a survey of algorithms to generate uniform random integers from random bits.

Some of these algorithms are "constant-time", others are unbiased, and still others are "optimal" in terms of the number of random bits they use on average, assuming we have a "true" random generator that can produce unbiased and independent random bits.

For further discussion, see the following answer of mine:

Peter O.
  • See "Daniel Lemire. 2019. Fast Random Integer Generation in an Interval. ACM Trans. Model. Comput. Simul. 29, 1, Article 3 (February 2019), 12 pages. DOI:https://doi.org/10.1145/3230636" for an update on your third reference. This is "nearly divisionless," and his benchmarks show that it significantly outperforms the widely-used algorithms from Java and OpenBSD. – pjs Aug 31 '23 at 20:23

You can generate the proper distribution by simply restricting the generated values to the necessary range. If you have N elements, generate K = ceiling(log2(N)) random bits. Doing so is slightly inefficient, but it still works as long as the K bits are generated randomly.

In your example with N = 3 you need at least K = 2 bits, which gives the four equally likely outcomes [00, 01, 10, 11]. To map these into the proper range, just ignore one of the outcomes, such as the last one: whenever it comes up, discard it and generate two fresh bits. Think of this as creating a new joint probability distribution, p(x_1, x_2), over the two bits where p(x_1=1, x_2=1) = 0, while each of the remaining outcomes has probability 1/3 due to renormalization (i.e., (1/4)/(3/4) = 1/3).
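
A minimal sketch of this for N = 3, assuming Python's secrets.randbits as the source of the two random bits:

```python
import secrets

def pick_one_of_three(elements):
    # Rejection sampling with K = 2 bits: 00, 01, 10 map to the three
    # elements; the outcome 11 is ignored and two fresh bits are drawn.
    while True:
        bits = secrets.randbits(2)   # uniform over {0, 1, 2, 3}
        if bits != 3:                # 3 == binary 11
            return elements[bits]

# Example: each element comes up with probability 1/3.
print(pick_one_of_three(["a", "b", "c"]))
```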

user119961
  • I'm not sure I follow. What do you mean by "just ignore one out of the outcomes"? Do you mean I should generate some new bits and try again or what? – mpen Oct 28 '14 at 16:57
  • His method is called "rejection sampling", and yes, it's the only way to get truly unbiased random distributions. – Lee Daniel Crocker Oct 28 '14 at 20:24