This is not a duplicate of #11766794 “What is the optimal algorithm for generating an unbiased random integer within a range?”. Both its top-upvoted answer and its accepted answer concern floating-point extrapolation, which does not even yield perfectly uniform integers. That question asks for ways to quickly get a good approximation of a uniform random integer given a serviceably uniform floating-point rand() value; I am asking this question in the context of a perfectly uniform random integer algorithm given a true random bit generator (deterministic or otherwise; the question applies equally either way).

I am asking, specifically, about theoretical optimality purely in terms of efficiency w/r/t random bits used: given a random bit stream, what algorithm consumes the fewest bits from it in the process of generating a perfectly uniform random integer within a given range?

For instance, CPython 3.9.0's random.randbelow has at least one trivial inefficiency: it requests one more bit than necessary whenever n is a power of 2 (including the trivial range n == 1), and then has to reject half of its draws to compensate:

def randbelow(n):
    "Return a random int in the range [0,n).  Returns 0 if n==0."
    if not n:
        return 0
    k = n.bit_length()  # don't use (n-1) here because n can be 1
    r = getrandbits(k)  # 0 <= r < 2**k
    while r >= n:
        r = getrandbits(k)
    return r

While this is easily enough patched by replacing "not n" with "n <= 1" and "n.bit_length()" with "(n-1).bit_length()" (a sketch of that patch follows), a little analysis shows it leaves even more to be desired.
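
First, for concreteness, here is the patched version (my own sketch of the two substitutions just described, assuming the same getrandbits source as before):

def randbelow_patched(n):
    "Return a random int in the range [0, n)."
    if n <= 1:
        return 0  # a 0- or 1-element range costs no bits
    k = (n - 1).bit_length()  # smallest k such that 2**k >= n
    r = getrandbits(k)
    while r >= n:  # rejection loop, unchanged from the original
        r = getrandbits(k)
    return r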

Say one is generating an integer over the range [0, 4097): nearly half of all calls to getrandbits(13) will overshoot the range. Worse, the rejection is often decided early: if, say, the first two bits come back high, the partial value is already guaranteed to be >= n, yet the call consumes the remaining 11 bits anyway and discards them when it seemingly didn't need to. So it would seem that this algorithm is obviously nonoptimal.
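
To put rough numbers on that (my own back-of-the-envelope arithmetic, not from anything above): with n = 4097 and k = 13, just over half of all draws survive, so the expected cost is nearly 26 bits per integer against an entropy of barely more than 12:

from math import log2

n, k = 4097, 13
p_accept = n / 2**k           # 4097/8192, just over 1/2
expected_bits = k / p_accept  # ~25.99 bits consumed per integer returned
entropy = log2(n)             # ~12.0004 bits actually required
print(p_accept, expected_bits, entropy)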

The best I could come up with in an hour this evening was the following algorithm:

def randbelow(n):
    if n <= 1:
        return 0
    k = (n - 1).bit_length()  # smallest k such that 2**k >= n
    while True:
        r = 0
        for i in reversed(range(k)):  # draw one bit at a time, MSB first
            r |= getrandbits(1) << i
            if r >= n:  # no completion of the low bits can bring r back below n
                break   # reject as early as possible and start over
        else:
            return r    # all k bits drawn without overshooting n
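
As a quick empirical check of the above (a throwaway harness of my own, not part of the question), one can shadow getrandbits with a counting shim in the same module and average the bit cost over many draws:

import random

bits_consumed = 0

def getrandbits(k):  # counting shim; randbelow above will resolve to this
    global bits_consumed
    bits_consumed += k
    return random.getrandbits(k)

trials = 100_000
for _ in range(trials):
    randbelow(4097)
print(bits_consumed / trials)  # compare against log2(4097) ≈ 12.0004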

However, I am no mathematician, and just because I fixed the inefficiencies that I could immediately see doesn't give me any confidence that I have stumbled, in an afternoon, upon the most efficient possible uniform integer selection algorithm.

Say, for instance, the bits are being purchased from a quantum or atmospheric RNG service; or generated as part of a multi-party protocol in which every individual bit takes several round-trips; or produced on an embedded device without any hardware RNG support. Whatever the case may be, I'm only asking the direct question: what algorithm for generating (perfectly) uniformly random integers from a true random bit stream is the most efficient with respect to random bits consumed? (Or, if not known with certainty, what is the best current candidate?)

(I have used Python in these examples because it's what I am working in primarily this season, but the question is by no means specific to any language, except insofar as the algorithm itself must generalize to numbers above 2^64.)

  • take "fewest" in the reasonable sense of amortized, in the limit… however you will, just so long as it doesn't come at the cost of _perfection_ in the distribution's uniformity given a true random bitstream – JamesTheAwesomeDude Feb 06 '22 at 02:06
  • [An optimal algorithm for bounded random integers](https://github.com/apple/swift/pull/39143) – rob mayoff Feb 06 '22 at 02:11
  • Arithmetic coding (rob's link is a rediscovery without attribution). The precision requirement grows without bound unless you're willing to take a little waste, however. – David Eisenstat Feb 06 '22 at 04:19
  • See [my answer](https://stackoverflow.com/a/62902524/815724) in the first question you linked to, or see https://stackoverflow.com/questions/6046918/. In fact, I have given answers to [questions similar to yours](https://stackoverflow.com/search?tab=newest&q=user:815724%20Knuth%20yao). – Peter O. Feb 06 '22 at 04:25
  • @PeterO. Thanks for the tip! I would not have expected to find such a gem of an answer buried _so far down_ the ranking of #11766794. Further, your answer to #6046918 addresses my question directly, and I'm peeved that I didn't find it while drafting this. I'll mark this question as a duplicate of it. – JamesTheAwesomeDude Feb 06 '22 at 22:57

1 Answer

The Python below implements arithmetic coding in exact arithmetic. The exact fractions grow without bound, so this becomes very expensive computationally, but it achieves entropy + O(1) bits in expectation, which is basically optimal.

import collections
import random
from fractions import Fraction
from math import floor, log2


meter = 0  # running count of random bits consumed


def metered_random_bits():
    """Yield fair random bits forever, counting each one on the meter."""
    global meter
    while True:
        yield bool(random.randrange(2))
        meter += 1


class ArithmeticDecoder:
    def __init__(self, bits):
        # Invariant: conditioned on the bits consumed so far, the stream's
        # "random real" is uniform over [low, low + width).
        self._low = Fraction(0)
        self._width = Fraction(1)
        self._bits = bits

    def randrange(self, n):
        # Scale the interval by n so that each outcome f in [0, n)
        # corresponds to the unit subinterval [f, f + 1).
        self._low *= n
        self._width *= n
        while True:
            f = floor(self._low)
            if self._low + self._width <= f + 1:
                # The interval lies entirely within [f, f + 1): the outcome
                # is decided. Keep the fractional remainder as leftover
                # randomness for subsequent draws.
                self._low -= f
                return f
            # Undecided: consume one more bit, halving the interval.
            self._width /= 2
            if next(self._bits):
                self._low += self._width


if __name__ == "__main__":
    k = 3000
    n = 7
    decoder = ArithmeticDecoder(metered_random_bits())
    print(collections.Counter(decoder.randrange(n) for i in range(k)))
    print("used", meter, "bits")
    print("entropy", k * log2(n), "bits")
David Eisenstat
  • Obviously, if you're able to "batch up" the draws, you can get arbitrarily close to 100% efficiency… but what about in expectation for **single-draw, without the ability to preserve state between draws**? (This was the intent behind my original question—I take the blame for not properly expressing it, as I had forgotten about arithmetic coding.) – JamesTheAwesomeDude Feb 06 '22 at 19:03
  • @JamesTheAwesomeDude I think that the Fast Dice Roller described in Peter's linked answer is optimal. – David Eisenstat Feb 06 '22 at 21:25
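
For reference, here is a sketch of the Fast Dice Roller mentioned in the last comment (my own transcription of Lumbroso's algorithm from "Optimal Discrete Uniform Generation from Coin Flips, and Applications"; not code from the question or answer). It consumes bits one at a time and recycles the rejected states rather than starting over:

def fast_dice_roller(n, bits):
    """Uniform over [0, n) from an iterator of fair bits."""
    v, c = 1, 0  # invariant: c is uniform over [0, v)
    while True:
        v, c = v << 1, (c << 1) | next(bits)  # absorb one bit
        if v >= n:
            if c < n:
                return c
            v, c = v - n, c - n  # recycle the v - n leftover states

It composes directly with the answer's metered bit stream, e.g. fast_dice_roller(7, metered_random_bits()).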