I looked at the "Fast Dice Roller" (FDR) pointed to by @Peter.O, which is indeed simple (and avoids dividing). But each time a random number is generated, this will eat some number of bits and discard the fraction of those bits it does not use.
The "batching"/"pooling" techniques seem to do better than FDR, because the unused fractions of bits are (at least partly) retained.
But interestingly, the DrMath thing you referenced is basically the same as the FDR, but does not start from scratch for each random value it returns.
So the FDR to return 0..n-1 goes:
random(n):
    m = 1 ; r = 0
    while 1:
        # Have r random and evenly distributed in 0..m-1
        # Need m >= n -- can double m and double r, adding a random bit, until
        # we get that. r remains evenly distributed in 0..m-1
        while m < n: r = 2*r + next_bit() ; m = m*2
        # Now have r < m and n <= m < n*2
        if r < n: return r          # Hurrah !
        # Have overshot, so reduce m and r to m MOD n and r MOD n
        m -= n ; r -= n ;

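For concreteness, here is a minimal Python sketch of that pseudocode. Note that next_bit() is my stand-in for whatever supplies the (expensive) random bits -- here it is faked with Python's random module, purely so the thing runs:

import random as _pyrandom

def next_bit():
    # stand-in bit source, for testing only -- in real use this would be
    # the expensive true-random bit generator we are trying to be frugal with
    return _pyrandom.getrandbits(1)

def random_fdr(n):
    # Fast Dice Roller: return a value uniformly distributed in 0..n-1
    m = 1 ; r = 0
    while 1:
        # double m (feeding a bit into r) until m >= n;
        # r remains evenly distributed in 0..m-1
        while m < n:
            r = 2*r + next_bit()
            m = m*2
        if r < n:
            return r                # Hurrah !
        # overshot: reduce m and r to m MOD n and r MOD n, and go round again
        m -= n ; r -= n
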
The DrMath thing goes:
# Initialisation once before first call of random(n)
ms = 1 ; rs = 0
N = ...                             # N >= maximum n and N*2 does not overflow

# The function -- using the "static"/"global" ms, rs and N
random(n):
    m = ms ; r = rs
    while 1:
        # Same as FDR -- except work up to N, not n
        while m < N: r = 2*r + next_bit() ; m = m*2 ;
        # Now have r < m and m >= N
        # Set nq = largest multiple of n <= m
        # In FDR, at this point q = 1 and nq = n
        q  = m DIV n ;
        nq = n * q
        if r < nq:                  # all set if r < nq
            # in FDR ms = 1, rs = 0
            ms = q                  # keep stuff not used this time
            rs = r DIV n            # ditto
            return r MOD n          # hurrah !
        # Overshot, so reduce MOD n*q -- remembering, for FDR q == 1
        m = m - nq
        r = r - nq

which, as noted, is basically the same as FDR, but keeps track of the unused randomness.
When testing I found:
FDR: for 100000 values range=3 used 266804 bits cost=1.6833
DrMath: for 100000 values range=3 used 158526 bits cost=1.0002
Where the cost is bits-used / (100000 * log2(3)), noting that log2(3) is approximately 1.58496. (So the cost is the number of bits used divided by the number of bits one would hope to use.)
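For example, the two range=3 costs above work out, in Python, as:

import math

# cost = bits-used / (number-of-values * log2(range))
print(266804 / (100000 * math.log2(3)))     # FDR    -> 1.6833...
print(158526 / (100000 * math.log2(3)))     # DrMath -> 1.0002...
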
Also:
FDR: for 100000 values range=17: 576579 bits cost=1.4106
DrMath: for 100000 values range=17: 408774 bits cost=1.0001
And:
FDR: for 100000 values ranges=5..60: 578397 bits cost=1.2102
DrMath: for 100000 values ranges=5..60: 477953 bits cost=1.0001
where I constructed 100000 values, and for each one chose a range in 5..60 (inclusive).
It seems to me that DrMath has it ! Though for larger ranges it has less of an advantage.
Mind you... DrMath uses at least 2 divisions per random value returned, which gives me conniptions run-time-wise. But you did say you weren't interested in run-time efficiency.
How does it work ?
So, we want a sequence of random values r to be uniformly distributed in a range 0..n-1. Inconveniently, we only have a source of randomness which gives us random values which are uniformly distributed in 0..m-1. Typically m will be a power of 2 -- and let us assume that n < m (if n == m the problem is trivial, if n > m the problem is impossible). For any r we can take r MOD n to give a random value in the required range. If we only use r when r < n then (trivially) we have the uniform distribution we want. If we only use r when r < (n * q) and (n * q) <= m we also have a uniform distribution. We are here "rejecting" r which are "too big". The fewer r we reject, the better. So we should choose q such that (n * q) <= m < (n * (q+1)) -- so n * q is the largest multiple of n less than or equal to m. This, in turn, tells us that n "much less" than m is to be preferred.

When we "reject" a given r we can throw it all away, but that turns out not to be completely necessary. Also, m does not have to be a power of 2. But we will get to that later.

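As a tiny illustration of that acceptance rule (my own numbers, not anything from the DrMath page): with n = 3 and m = 16 we get q = 5 and n*q = 15, so only r = 15 is rejected, and the 15 accepted values map onto 0..2 perfectly evenly:

from collections import Counter

n, m = 3, 16
q  = m // n                  # 5
nq = n * q                   # 15 -- largest multiple of n <= m
print(Counter(r % n for r in range(nq)))
# each value 0..n-1 occurs exactly q = 5 times
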
Here is some working Python:
M = 1
R = 0
N = (2**63)                         # N >= maximum range
REJECT_COUNT = 0

def random_drmath(n):
    global M, R, REJECT_COUNT
    # (1) load m and r "pool"
    m = M
    r = R
    while 1:
        # (2) want N <= m < N*2
        #     have 0 <= r < m, and that remains true.
        #     also r uniformly distributed in 0..m-1, and that remains true
        while m < N:
            r = 2*r + next_bit()
            m = m*2
        # (3) need r < nq where nq = largest multiple of n <= m
        q  = m // n
        nq = n * q
        if r < nq:
            # (4) update the m and r "pool" and return 0..n-1
            M = q
            R = r // n
            return r % n            # hurrah !
        # (5) reject: so reduce both m and r by MOD n*q
        m = m - nq
        r = r - nq
        REJECT_COUNT += 1

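To reproduce the sort of figures quoted above, my test harness (just a sketch, not part of the DrMath code) replaces next_bit() with a version that counts the bits consumed:

import math, random as _pyrandom

BITS_USED = 0

def next_bit():
    # counting stand-in for the real bit source
    global BITS_USED
    BITS_USED += 1
    return _pyrandom.getrandbits(1)

# 100000 values in range 3, as in the first table above
for _ in range(100000):
    random_drmath(3)
cost = BITS_USED / (100000 * math.log2(3))
print("used %d bits  cost=%.4f  rejects=%d" % (BITS_USED, cost, REJECT_COUNT))
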
Must have N >= maximum range, preferably much bigger. 2**31 or 2**63 are obvious choices.

On the first call of random_drmath() step (2) will read random bits to "fill the pool". For N = 2**63, we will end up with m = 2**63 and r with 63 random bits. Clearly r is random and uniformly distributed in 0..m-1. [So far, so good.]

Now (and on all further calls of random_drmath()) we hope to extract a random value uniformly in 0..n-1 from r, as discussed above. So step (3) constructs nq, which is the largest multiple of n less than or equal to m. If r >= nq we cannot use it, because there are fewer than n values in nq..m-1 -- this is the usual "reject" criterion.

So, where r < nq we can return a value -- step (4). The trick here is to think of m and r as numbers "base-n". The least significant "digit" of r is extracted (r % n) and returned. Then m and r are shifted right by one "digit" (q = m // n and r // n), and stored in the "pool". I think it is clear that at this point we still have r < m, with r random and uniformly distributed in 0..m-1. But m is no longer a power of 2 -- and that's OK.

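A quick way to convince yourself of the "base-n" view (again my own little check): condition on acceptance and look at what goes back into the pool. With n = 3 and m = 16 (so q = 5, nq = 15), the new R = r // n is uniform in 0..q-1, i.e. 0..M-1 for the new M = q:

from collections import Counter

n, m = 3, 16
q, nq = m // n, n * (m // n)                 # q = 5, nq = 15
print(Counter(r // n for r in range(nq)))
# each of 0..q-1 occurs exactly n = 3 times -- uniform in 0..q-1
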
But, if r >= nq we must reduce r and m together -- step (5) -- and try again. Trivially, we could set m = 1 ; r = 0 and start again. But what we do is subtract nq from both m and r. That leaves r uniformly distributed in 0..m-1. This last step feels like magic, but we know that r is in nq..m-1 and each possible value has equal probability, so r-nq is in the range 0..m-nq-1 and each possible value still has equal probability ! [Remember that the 'invariant' at the top of the while loop is that r is random and uniformly distributed in 0..m-1.]

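And the same sort of check for the rejection step (my numbers again): with n = 6 and m = 16 we get q = 2 and nq = 12, so the rejected values are r = 12..15, each equally likely; subtracting nq leaves them uniform in 0..3, which is exactly "uniform in 0..m-1" for the new m = 16 - 12 = 4:

n, m = 6, 16
q, nq = m // n, n * (m // n)                 # q = 2, nq = 12
print([r - nq for r in range(nq, m)], "new m =", m - nq)
# [0, 1, 2, 3] new m = 4
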
For small n the rejection step will discard most of r, but for small n (compared to N) we hope not to reject very often. Conversely, for large n (compared to N) we may expect to reject more often, but this retains at least some of the random bits we have eaten so far. I feel there might be a way to retain more of r... but I haven't thought of a simple way to do that... and if the cost of reading one random bit is high, it might be worth trying to find a not-simple way !
FWIW: setting N = 128 I get:
FDR: for 100000 values ranges=3.. 15: 389026 bits cost=1.2881
DrMath: for 100000 values ranges=3.. 15: 315815 bits cost=1.0457
FDR: for 100000 values ranges 3.. 31: 476428 bits cost=1.2371
DrMath: for 100000 values ranges 3.. 31: 410195 bits cost=1.0651
FDR: for 100000 values ranges 3.. 63: 568687 bits cost=1.2003
DrMath: for 100000 values ranges 3.. 63: 517674 bits cost=1.0927
FDR: for 100000 values ranges 3..127: 664333 bits cost=1.1727
DrMath: for 100000 values ranges 3..127: 639269 bits cost=1.1284
so as n approaches N the cost per value goes up.