19

Yesterday I had this interview question, which I couldn't fully answer:

Given a function f() that returns 0 or 1 with a perfect 1:1 distribution, create a function f(n) that returns 0, 1, 2, ..., n-1, each with probability 1/n.

I could come up with a solution for when n is a natural power of 2, i.e. use f() to generate the bits of a binary number with k = log2(n) digits. But this obviously wouldn't work for, say, n = 5, as it could generate f(5) = 5, 6 or 7, which we do not want.
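A minimal Python sketch of that power-of-two approach, with coin() standing in for the given f():

```python
import random

def coin():
    return random.randint(0, 1)  # stands in for the given f()

def rand_pow2(k):
    """Uniform integer in 0 .. 2**k - 1, assembled from k coin flips."""
    r = 0
    for _ in range(k):
        r = (r << 1) | coin()  # shift in one random bit
    return r
```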

Does anyone know a solution?

Gilles 'SO- stop being evil'
  • 104,111
  • 38
  • 209
  • 254
bountiful
  • 814
  • 1
  • 8
  • 22
  • Here don't you mean given a number `n` which is either 0 or 1, create an f(n) = 0, 1, 2, ..., n - 1? – MoonKnight Nov 03 '12 at 12:39
  • Does this answer your question? [How to generate a random integer in the range \[0,n\] from a stream of random bits without wasting bits?](https://stackoverflow.com/questions/6046918/how-to-generate-a-random-integer-in-the-range-0-n-from-a-stream-of-random-bits) – Peter O. Jul 30 '20 at 03:11
  • Especially my answer: https://stackoverflow.com/a/62920514/815724 – Peter O. Jul 30 '20 at 03:11

4 Answers

25

You can build an RNG for the smallest power of two greater than or equal to n, as you described. Then, whenever this algorithm generates a number larger than n-1, throw that number away and try again. This is called the rejection method.

Addition

The algorithm is

Let m = 2^k >= n where k is as small as possible.
do
   Let r = random number in 0 .. m-1 generated by k coin flips
while r >= n
return r

The probability that this loop stops within i iterations is at least 1 - (1/2)^i, which approaches 1 very rapidly: the loop is still running after 30 iterations with probability less than one in a billion.
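A minimal Python sketch of this loop, with coin() standing in for the given fair 0/1 function:

```python
import random

def coin():
    return random.randint(0, 1)  # stands in for the given fair 0/1 function

def rand_n(n):
    """Uniform integer in 0 .. n-1 via rejection sampling."""
    k = (n - 1).bit_length()  # smallest k with 2**k >= n
    while True:
        r = 0
        for _ in range(k):
            r = (r << 1) | coin()  # one coin flip per bit
        if r < n:
            return r  # accept
        # otherwise reject and flip k more coins
```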

You can decrease the expected number of iterations with a slightly modified algorithm:

Choose p >= 1
Let m = 2^k >= p n where k is as small as possible.
do
   Let r = random number in 0 .. m-1 generated by k coin flips
while r >= p n
return floor(r / p)

For example, if we are trying to generate 0 .. 4 (n = 5) with the simpler algorithm, we would reject 5, 6 and 7, which is 3/8 of the results. With p = 3 (for example), pn = 15, so m = 16, and we would reject only 15, or 1/16 of the results. The price is four coin flips rather than three, plus a division. You can continue to increase p and add coin flips to decrease the rejection rate as far as you wish.
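A minimal Python sketch of the modified algorithm (coin() again stands in for the fair 0/1 source):

```python
import random

def coin():
    return random.randint(0, 1)  # stands in for the given fair 0/1 function

def rand_n_scaled(n, p):
    """Uniform integer in 0 .. n-1: draw uniformly from 0 .. p*n - 1
    (rejecting the leftover values up to 2**k - 1), then divide by p."""
    k = (p * n - 1).bit_length()  # smallest k with 2**k >= p*n
    while True:
        r = 0
        for _ in range(k):
            r = (r << 1) | coin()
        if r < p * n:
            return r // p  # each result corresponds to exactly p raw values
```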

Gene
  • 46,253
  • 4
  • 58
  • 96
  • How is this going to produce the range 0, 1, 2, ..., n - 2, n - 1? – MoonKnight Nov 03 '12 at 12:37
  • by interpreting `k` consecutive tosses as a binary number in the range `0..2^k-1` – Andre Holzner Nov 03 '12 at 12:52
  • 2
    Python's [`randrange(n)` that returns a uniformly distributed random number from 0 to `n-1`](http://hg.python.org/cpython/file/95d1adf144ee/Lib/random.py#l224) uses this method. – jfs Nov 03 '12 at 13:28
  • 3
    A useful method, but note that it can in theory take an infinitely long time to run. The probability of needing at least m+1 "rolls" (each of k coin-flips) is (2^k - n)^m/2^k, which goes down quickly with increasing m, but never reaches zero. – j_random_hacker Nov 03 '12 at 14:08
  • Now it seems so obvious, but not necessarily when the pressure is on in an interview! Thanks a lot. – bountiful Nov 03 '12 at 15:27
  • Whoops, that probability I gave in my earlier comment should be ((2^k - n)/2^k)^m. – j_random_hacker Nov 04 '12 at 14:06
  • True of course, but at least there is a way of getting the probability of rejection as low as you like. I added it to the answer. – Gene Nov 04 '12 at 21:03
  • 1. Does p>1 algorithm actually require less coin flips? 2. Does floor(r / p) introduce a bias if p is not 2^m? – jfs Nov 05 '12 at 08:55
  • No bias. In the `p=3` example random values [0..2] map to 0, [3..5] to 1, etc. up to [12..14], which maps to 4. In each case it's a 3 to 1 mapping. Values of 15 are rejected. – Gene Nov 05 '12 at 16:40
  • "Less flips" can only be discussed with expected values. In the example for p=1, you have E(#flips) = 5/8*3*(1 + sum_{k=1}^{\infty} k(3/8)^k) = 3+5/8. With p=3, we have 15/16*4*(1 + sum_{k=1}^{\infty} k(1/16)^k) = 4.0166... So you need a fraction of a bit more on average, but the run time is far less variable. – Gene Nov 05 '12 at 17:17
  • 1
    @j_random_hacker It's not too difficult to show that any method that gives a uniform result *must* discard some results. If you generate `z` random bits (for any `z`), you've got `2^z` possible outcomes in total. If `n` isn't a factor of `2^z`, there is no uniform way of partitioning these `2^z` outcomes into `n` different results. The best you can hope for is to assign almost all of them, and leave at most `n-1` unassigned; these will have to be rejected. – chiastic-security Aug 24 '14 at 07:16
  • @chiastic-security: I accept that it must discard some results some of the time, but it might still be possible to design a system in which the maximum number of results that need to be discarded is bounded (ideally by a constant, which would make it O(1)) -- rather than unbounded as is the case for Gene's approach. OTOH it might be impossible, but I think that requires more proof than you've provided so far. – j_random_hacker Aug 25 '14 at 11:56
  • 1
    @j_random_hacker It's impossible, I'm afraid... Suppose it was bounded by `R`. That means that there's an algorithm that requires at most `R` random bits, and produces something uniform from `0` to `n-1`. But as in the previous comment, there are `2^R` possible inputs to the function (regardless of what it uses and what it ignores), and they are uniformly distributed. You can't uniformly partition these into `n` results unless `n` is a factor of `2^R`. – chiastic-security Aug 25 '14 at 12:30
  • @chiastic-security: I see it now, thanks. All we can do is halve or sum probabilities, and the only one we start with is 1. If we want to get a probability that has a prime factor k != 2 in its denominator after simplification, then this can't be generated purely by halving; and if two simplified fractions sum to another simplified fraction with a 3 in its denominator, then at least one of the two summands must also have had a 3 in its denominator, so we can't get there by summing either. (A roundabout proof no doubt, but sometimes I can't let a problem go till I've convinced myself!) – j_random_hacker Aug 25 '14 at 16:18
  • How can this function lead to uniform probability? E.g. if n = 6, k = 3 and m = 8: to generate 0, we have to get 3 coin flips of 0, and the probability of that happening is 0.5^3. So 0 has a significantly lower chance of happening than, say, 3, for which only 0, 1 and 1 have to appear. – Code Junkie Feb 09 '20 at 00:46
  • @CodeJunkie I don't see what you're saying. There are 8 possible outcomes of 3 coin flips. Each of them has a probability of 1/8. One of the outcomes is 000, another is 011. Their probabilities are equal. – Gene Feb 09 '20 at 08:08
2

Another interesting solution can be derived through a Markov chain Monte Carlo technique, the Metropolis-Hastings algorithm. This would be significantly more efficient if a large number of samples were required, but it only approaches the uniform distribution in the limit.

 initialize: x[0] arbitrarily
 for i = 1, 2, ..., N
  if (f() == 1) x[i] = (x[i-1] + 1) % n
  else x[i] = (x[i-1] - 1 + n) % n

For large N the vector x will contain approximately uniformly distributed numbers between 0 and n-1. Additionally, by adding an accept/reject step we can simulate from an arbitrary distribution, but that would require simulating uniform random numbers on [0,1] as a sub-procedure.
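A Python sketch of the chain described above, with f() standing in for the given fair bit. Note that successive samples are correlated and the early ones are biased toward the starting point:

```python
import random

def f():
    return random.randint(0, 1)  # stands in for the given fair 0/1 function

def walk_samples(n, N, x0=0):
    """Random walk on {0, ..., n-1}: each fair coin flip moves one step
    up or down (mod n). The uniform distribution is stationary for this
    chain, but samples are only approximately uniform for finite N."""
    x = x0
    samples = []
    for _ in range(N):
        if f() == 1:
            x = (x + 1) % n
        else:
            x = (x - 1) % n
        samples.append(x)
    return samples
```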

fairidox
  • 3,358
  • 2
  • 28
  • 29
0
from math import floor

# coin_flip() is assumed to return True/False with equal probability.

def gen(a, b):
  # Repeatedly bisect the interval [a, b) with coin flips until every
  # number remaining in the interval has the same floor, then return it.
  min_possible = float(a)
  max_possible = float(b)

  while True:
    floor_min_possible = floor(min_possible)
    floor_max_possible = floor(max_possible)
    if max_possible.is_integer():
      floor_max_possible -= 1

    if floor_max_possible == floor_min_possible:
      return floor_max_possible

    mid = (min_possible + max_possible)/2
    if coin_flip():
      min_possible = mid
    else:
      max_possible = mid
Brett
  • 1
-3
My #RandomNumberGenerator #RNG

With any f(x) that gives random ints from 1 to x, we can get random ints from 1 to k, for any k:

Choose integers p and q such that p is a factor of x, p^q >= k, and p^q is as small as possible;

Lbl A
i = 0; s = 1; while i < q {
  s += ((f(x) - 1) mod p) * p^i;
  i++;
}
if s > k, goto A, else return s


//** About notation/terms:
rand = random
int = integer
mod is the modulo operation
Lbl is a "Label", from the Basic language, and serves as a coordinate for executing code. After the while loop, if s > k, then "goto A" means return to the point in the code that says "Lbl A" and resume. Returning to Lbl A and processing the code again resets i to 0 and s to 1.
i is an iterator for powers of p, and s is a sum.
"s += foo" means "let s now equal what it used to be + foo".
"i++" means "let i now equal what it used to be + 1".
f(x) returns random integers from 1 to x. **//
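A Python sketch of the procedure, assuming f(x) returns uniform integers from 1 to x. The digit is taken as (f(x) - 1) mod p so that it spans 0 .. p-1:

```python
import random

def f(x):
    return random.randint(1, x)  # the given source: uniform integers 1 .. x

def rng(k, x, p):
    """Uniform integer in 1 .. k, assuming p is a factor of x.
    Each call to f(x) yields one uniform base-p digit; q digits give a
    uniform value in 1 .. p**q, and values above k are rejected."""
    q = 1
    while p ** q < k:  # smallest q with p**q >= k
        q += 1
    while True:
        s = 1
        for i in range(q):
            digit = (f(x) - 1) % p  # uniform on 0 .. p-1 since p divides x
            s += digit * p ** i
        if s <= k:
            return s
```

For example, rng(17, 6, 3) realizes the dice example from the comments: three rolls of a d6 give a base-3 number in 1 .. 27, and results above 17 are rejected.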


I figured it out/invented/solved it on my own, around 2008. The method is discussed as common knowledge here. Does anyone know since when the rejection method for random number generation has been common knowledge? RSVP.
  • Lbl is a “Label”, from the Basic language, & serves as a coordinates for executing code. After the while loop, if s > k, then “goto A” means return to the point of code where it says “Lbl A”, & resume. If you return to Lbl A & process the code again, it resets the values of i to 0 & s to 1. i is an iterator for powers of p, & s is a sum. s+= foo means let s now equal what it used to be + foo. f(x) returns random integers between 1 & x. Does that help? Cheers. – Nikh Beghzr Jun 30 '18 at 05:20
  • add explanation to your answer not in comments. – Dinesh Ghule Jun 30 '18 at 05:24
  • This is the basic RNG version that can upgrade. I'll soon begin working on the upgraded code. This basic RNG works /w only 1 factorization p of x, however it's possible to use multiple factors to lower calculation time. e.g. use dice, or f(6), to generate rand ints from 1 to 12: 1. use f(6) to generate a result from 1 to 6. 2. use {(f(6) mod 2)-1} to generate a result from 0 to 1. 3. return {(the result from step 1) + (6 * the result from step 2)}. Comparatively, the basic RNG may use p=2 & q=4 – Nikh Beghzr Jun 30 '18 at 15:43
  • e.g. use dice, or f(6), to generate rand ints from 1 to 17 (i.e. “use f(6) to generate rand ints from 1 to 18, accepting only returns from 1 to 17”) 1. use f(6) to get a RNG int from 1 to 6. 2. use {(f(6) mod 3) - 1} to RNG an int from 0 to 2. 3. if {(the result from step 1) + (6 * the result from step 2)} > 17, restart /w step 1, else return that value. – Nikh Beghzr Jun 30 '18 at 16:58
  • Additional comments, including an algorithm for using coin flips to generate (12 sided die, 'd12') random integers from 1 to 12, exceeds stackoverflow comment character count limits, so are linked https://plus.google.com/u/0/113366929351507812798/posts/h2dJSEYQeuU – Nikh Beghzr Jul 02 '18 at 13:15