This is not a duplicate of #11766794 “What is the optimal algorithm for generating an unbiased random integer within a range?”. Both its top-upvoted answer and its Accepted answer have to do with floating-point extrapolation, which will not even yield perfectly uniform integers. That question is asking after ways to quickly get good approximations of a uniform random integer given a serviceably uniform floating-point rand()
value; I am asking this question in the context of a perfectly uniform random integer algorithm given a true random bit generator (deterministic or otherwise; the question applies equally either way).
I am asking, specifically, about theoretical optimality solely in terms of efficiency with respect to the random bits used: given a random bit stream, which algorithm consumes the fewest bits from it in the process of generating a perfectly uniform random integer within a given range?
For instance, CPython 3.9.0's random.randbelow has at least one trivial inefficiency: it wastes a random bit whenever it is called with a power of 2 (including the trivial case n == 1):
def randbelow(n):
    "Return a random int in the range [0,n). Returns 0 if n==0."
    if not n:
        return 0
    k = n.bit_length()  # don't use (n-1) here because n can be 1
    r = getrandbits(k)  # 0 <= r < 2**k
    while r >= n:
        r = getrandbits(k)
    return r
While this is easily enough patched by replacing "not n" with "n <= 1" and "n.bit_length()" with "(n-1).bit_length()" (a patched version is sketched just below), a little analysis shows it leaves even more to be desired.
Say one is generating an integer over the range [0, 4097): roughly half of all the calls to getrandbits(13) will produce a value of 4097 or more and be rejected. Worse, once the first bit and, say, the second bit have both come up high, the draw is already doomed to rejection, yet getrandbits(13) consumes the remaining 11 bits anyway and then discards them. So it would seem that this algorithm is obviously non-optimal.
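To put rough numbers on that (my own back-of-the-envelope arithmetic): with k = 13 the expected cost is about 26 bits per output, against an information-theoretic minimum of just over 12 bits.

import math

n = 4097
k = (n - 1).bit_length()      # 13 bits per attempt
p_accept = n / 2**k           # 4097/8192, barely over 1/2
expected_bits = k / p_accept  # expected (geometric) number of attempts, times k bits each
print(k, p_accept, expected_bits, math.log2(n))
# 13 0.5001... ~25.99 ~12.0004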
The best I could come up with in an hour this evening was the following algorithm:
from random import getrandbits

def randbelow(n):
    if n <= 1:
        return 0
    k = (n - 1).bit_length()  # number of bits needed to represent n - 1
    while True:
        r = 0
        for i in reversed(range(k)):
            r |= getrandbits(1) << i
            if r >= n:  # the remaining bits cannot decrease r, so this draw is already doomed
                break   # reject early and start over
        else:
            return r
However, I am no mathematician, and having fixed the inefficiencies I could immediately see gives me no confidence that I have stumbled, in an afternoon, upon the most efficient possible uniform integer selection algorithm.
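For what it's worth, here is a rough sketch of a harness for comparing the two approaches by counting the bits actually consumed (the CountingBits wrapper and the function names are mine, purely illustrative):

import random

class CountingBits:
    "Wrap random.getrandbits and tally how many bits are drawn."
    def __init__(self):
        self.bits_used = 0
    def getrandbits(self, k):
        self.bits_used += k
        return random.getrandbits(k)

def randbelow_block(n, src):
    # patched CPython-style version: draw k bits at once, reject on overshoot
    if n <= 1:
        return 0
    k = (n - 1).bit_length()
    r = src.getrandbits(k)
    while r >= n:
        r = src.getrandbits(k)
    return r

def randbelow_bitwise(n, src):
    # bit-at-a-time version from above, with early rejection
    if n <= 1:
        return 0
    k = (n - 1).bit_length()
    while True:
        r = 0
        for i in reversed(range(k)):
            r |= src.getrandbits(1) << i
            if r >= n:
                break
        else:
            return r

n, trials = 4097, 100_000
for fn in (randbelow_block, randbelow_bitwise):
    src = CountingBits()
    for _ in range(trials):
        fn(n, src)
    print(fn.__name__, src.bits_used / trials, "bits per value")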
Say, for instance, the bits are being purchased from a quantum or atmospheric RNG service; or as part of a multi-party protocol in which every individual bit generation takes several round-trips; or on an embedded device without any hardware RNG support… whatever the case may be, I'm only asking the direct question: what algorithm for generating (perfectly) uniformly random integers from a true random bit stream is the most efficient with respect to random bits consumed? (Or, if not known with certainty, what is the best current candidate?)
(I have used Python in these examples because it's what I am working in primarily this season, but the question is by no means specific to any language, except insofar as the algorithm itself must generalize to numbers above 2^64.)