1

I am given a uniform integer random number generator ~ U3(1,3) (inclusive). I would like to generate integers ~ U5(1,5) (inclusive) using U3. What is the best way to do this?

The simplest approach I can think of is to sample twice from U3 and then use rejection sampling. That is, sampling twice from U3 gives us 9 possible combinations. We can assign the first 5 combinations to 1,2,3,4,5, and reject the last 4 combinations.

This approach expects to sample from U3 9/5 * 2 = 18/5 = 3.6 times.
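A minimal Python sketch of this two-sample scheme may make it concrete. The names `rand_u3` and `rand_u5_two_samples` are made up for illustration, and `rand_u3` is only a stand-in for the given generator, simulated here with `random.randint`.

```python
import random

def rand_u3():
    """Stand-in for the given U3 generator: uniform integer in 1..3."""
    return random.randint(1, 3)

def rand_u5_two_samples(u3=rand_u3):
    """Return a uniform integer in 1..5 built from pairs of U3 samples."""
    while True:
        # Two samples read as base-3 digits give a uniform index in 0..8.
        index = 3 * (u3() - 1) + (u3() - 1)
        if index < 5:
            # Accept the first 5 combinations and map 0..4 to 1..5.
            return index + 1
        # Otherwise the pair is one of the last 4 combinations: resample.
```

The `u3` parameter is there only so that a different source (or a scripted sequence, for testing) can be plugged in.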

Another approach could be to sample three times from U3. This gives us a sample space of 27 possible combinations. We can make use of 25 of these combinations and reject the last 2. This approach expects to use U3 27/25 * 3 = 81/25 = 3.24 times. It would be a little more tedious to write out than the first, since there are a lot more combinations, but its expected number of samples from U3 is lower.
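The two expected counts can be checked with exact arithmetic. This is only a sketch: the helper name `expected_u3_calls` is hypothetical, and it assumes each round draws a fixed number of U3 samples and is accepted independently with probability accepted / 3^samples.

```python
from fractions import Fraction

def expected_u3_calls(samples_per_round, accepted):
    """Expected U3 calls per U5 value for a fixed-size rejection round."""
    total = 3 ** samples_per_round
    # The number of rounds until acceptance is geometric with mean
    # total / accepted, and every round costs samples_per_round calls.
    return samples_per_round * Fraction(total, accepted)

two_samples = expected_u3_calls(2, 5)     # Fraction(18, 5), i.e. 3.6
three_samples = expected_u3_calls(3, 25)  # Fraction(81, 25), i.e. 3.24
```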

Are there other, perhaps better, approaches to doing this?

I have this marked as language agnostic, but I'm primarily looking into doing this in either Python or C++.

Peter O.
roulette01
  • I'd suggest that your second approach is optimal. The discard rate (2/27 = 7.4%) is the best that can be achieved with 32-bit arithmetic. (Nothing beats that until you reach 5^15 and 3^22, which reduces the discard rate to 2.75%) – r3mainer Sep 07 '20 at 10:43
  • @r3mainer Why does nothing beat it until you reach 5^15 and 3^22? I'm not sure what these numbers mean in the context of this problem. – roulette01 Sep 07 '20 at 14:36
  • If you generate U(1,27) from three rolls of U(1,3), you have to discard, on average, 2 out of every 27 numbers. This is a very low discard rate. In general, you're looking for values of p and q such that 1 - 5^p / 3^q is non-negative but as small as possible. For p=2 and q=3, the value is 0.074 (which means you have to discard 7.4% of your U(1,27) values). There is no combination of p and q that improves on this until you get to p=15 and q=22, where it decreases to 0.0275 (i.e., a discard rate of 2.75%), but at the expense of considerable added complexity. – r3mainer Sep 07 '20 at 20:34
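The search over p and q described in these comments can be reproduced with a short Python sketch. The function name `discard_rate_records` is made up for illustration; for each p it takes the smallest q with 3^q >= 5^p and reports every p at which the discard rate 1 - 5^p / 3^q reaches a new minimum.

```python
from fractions import Fraction

def discard_rate_records(max_p):
    """List (p, q, rate) each time 1 - 5**p / 3**q sets a new minimum."""
    records = []
    best = Fraction(1)
    for p in range(1, max_p + 1):
        q = 1
        while 3 ** q < 5 ** p:  # smallest q with 3**q >= 5**p
            q += 1
        rate = 1 - Fraction(5 ** p, 3 ** q)
        if rate < best:
            best = rate
            records.append((p, q, rate))
    return records

for p, q, rate in discard_rate_records(15):
    print(p, q, float(rate))
```

For `max_p = 15` the records fall at (p, q) = (1, 2), (2, 3) and (15, 22), which agrees with the figures quoted above.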

3 Answers

1

For the range [1, 3] to [1, 5], this is equivalent to rolling a 5-sided die with a 3-sided one.

However, this can't be done without "wasting" randomness (or running forever in the worst case), since 5 has a prime factor (namely 5 itself) that does not divide 3. Thus, the best that can be done is to use rejection sampling to get arbitrarily close to no "waste" of randomness (such as by batching multiple rolls of the 3-sided die until 3^n is "close enough" to a power of 5). In other words, the approaches you give in your question are about as good as they can get.

More generally, an algorithm to roll a k-sided die with a p-sided die will inevitably "waste" randomness (and run forever in the worst case) unless "every prime number dividing k also divides p", according to Lemma 3 in "Simulating a dice with a dice" by B. Kloeckner. For example:

  • Take the much more practical case that p is a power of 2 (any block of random bits is the same as rolling a die whose number of faces is a power of 2) and k is arbitrary. In this case, this "waste" and indefinite running time are inevitable unless k is also a power of 2.
  • This result applies to any case of rolling an n-sided die with an m-sided die, where n and m are different prime numbers. For example, look at the answers to a question for the case n = 7 and m = 5.
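The divisibility condition quoted from the lemma is easy to test mechanically. The Python sketch below only checks that condition ("every prime number dividing k also divides p"); the function names are hypothetical, and trial division is assumed to be good enough for small die sizes.

```python
def prime_factors(n):
    """Return the set of prime factors of n (trial division)."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def satisfies_lemma_condition(k, p):
    """True if every prime dividing k also divides p."""
    return prime_factors(k) <= prime_factors(p)

print(satisfies_lemma_condition(5, 3))  # False: the case in the question
print(satisfies_lemma_condition(8, 2))  # True: k and p both powers of 2
print(satisfies_lemma_condition(7, 5))  # False: two different primes
```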

See also this question: Frugal conversion of uniformly distributed random numbers from one range to another.

Peter O.
1

You do not need combinations. A slight tweak using base 3 arithmetic removes the need for a table. Rather than using the 1..3 result directly, subtract 1 to get it into the range 0..2 and treat it as a base 3 digit. For three samples you could do something like:

function sample3()
  result <- 0
  result <- result + 9 * (randU3() - 1)  // High digit: 9
  result <- result + 3 * (randU3() - 1)  // Middle digit: 3
  result <- result + 1 * (randU3() - 1)  // Units digit: 1
  return result
end function

That will give you a number in the range 0..26, or 1..27 if you add one. You can use that number directly in the rest of your program.
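In Python, the same idea might look like the sketch below, extended with the rejection step from the question so that it returns a value in 1..5. Here `rand_u3` is a stand-in for the given generator (simulated with `random.randint`), `rand_u5` is a hypothetical name, and reducing accepted values with `% 5` is one possible way to map the 25 kept combinations onto 5 outcomes.

```python
import random

def rand_u3():
    """Stand-in for the given U3 generator: uniform integer in 1..3."""
    return random.randint(1, 3)

def sample3(u3=rand_u3):
    """Read three U3 samples as base-3 digits: uniform integer in 0..26."""
    result = 0
    for _ in range(3):
        # Shift the digits so far one place left, then append a new digit.
        result = 3 * result + (u3() - 1)
    return result

def rand_u5(u3=rand_u3):
    """Uniform integer in 1..5, rejecting 2 of the 27 combinations."""
    while True:
        value = sample3(u3)
        if value < 25:
            # Each residue 0..4 occurs exactly 5 times among 0..24.
            return value % 5 + 1
```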

rossum
  • Ah, this seems to be what I alluded to with my second approach, but more nicely written out. So essentially here, the first, second, and third samples represent the most significant to least significant digits of a base 3 integer? – roulette01 Sep 07 '20 at 14:39
0

Peter O. is right: you cannot avoid losing some randomness. So the choice comes down to trade-offs such as how expensive calls to U(1,3) are, code clarity, and simplicity.

Here is my variant, which makes bits from U(1,3) and combines them, with rejection:

C/C++ (untested!)

int U13(); // your U(1,3)

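int getBit() { // single unbiased random bit
    for(;;) {
        int v = U13() - 1; // uniform in {0, 1, 2}
        if (v < 2)         // reject 2 so that 0 and 1 stay equally likely
            return v;
    }
}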

int U15() {
    int r;
    for(;;) {
        int q = getBit() + 2*getBit() + 4*getBit(); // uniform in [0...8)
        if (q < 5) {   // need range [0...5)
            r = q + 1; // q accepted, make it in [1...5]
            break;
        }
    }
    return r;
}
Severin Pappadeux