The code you posted is an example of an accept-reject algorithm. The expression
5 * (rand5() - 1) + (rand5() - 1);
generates a random number uniformly distributed between 0 and 24. The first term is 5 times a random number between 0 and 4, yielding one of the set {0, 5, 10, 15, 20} with equal probability. the second is a random number between 0 and 4. Since the two random numbers are presumably independent, adding them gives a random number uniformly distributed between 0 and 24. The second number "fills in" the gaps between the numbers generated by the first term.
The test then rejects any number that is 21 or above and the process repeats. When a number passes the test, it will be a random number uniformly distributed between 0 and 20 (inclusive). This range is then divided into 7 numbers (between 0 and 6) using % 7
and one is added before returning. Since there are 21 numbers in the interval [0, 20] and 21 is divisible by 7, the numbers between 0 and 6 will occur with equal probability.
In a table :
A B num
rand5()-1 rand5()-1 5 * A + B num % 7 + 1
--------- --------- --------- -----------
0 0 0 1
1 0 5 6
2 0 10 4
3 0 15 2
4 0 20 7
0 1 1 2
1 1 6 7
2 1 11 5
3 1 16 3
4 1 21 reject
0 2 2 3
1 2 7 1
2 2 12 6
3 2 17 4
4 2 22 reject
0 3 3 4
1 3 8 2
2 3 13 7
3 3 18 5
4 3 23 reject
0 4 4 5
1 4 9 3
2 4 14 1
3 4 19 6
4 4 24 reject
Note that the last column has exactly three of each number in the range [1, 6].
Another way of understanding the logic is to think in terms of base-5 arithmetic. Since rand5()
generates a random integer between 1 and 5 (inclusive), we can use rand5() - 1
to generate random digits for a base-5 number. We're trying to generate the numbers 1 through 7 (or, equivalently, 0 through 6), so we need at least two base-5 digits just to have seven numbers. So let's just generate a random, two-digit, base-5 number, digit by digit. That's what 5*(rand5() - 1) + (rand5() - 1)
does. We could then do this over and over until we got a number between 0 and 6 (0 and 115). However, it would be more efficient to reject fewer of the two-digit numbers we are generating. We can do this by using everything up to (but not including) the largest two-digit, base-5 number that is a multiple of 7. That happens to be 415, which is 2110. (We exclude 415 itself because we're starting at 0, not 1.) Since we want a number between 0 and 6, we reduce the range using % 7
. (We could also have divided by 3, since there are exactly three, 7-integer ranges in [0, 405]. However, using the remainder operator makes it clear that we're aiming for the range [0, 6].)
P.S. Your proposed solution would not work. While it appears to have the correct range, the expression can only generate five distinct numbers, so it cannot possibly be correct. It can also produce 0 (when rand5()
returns 1). If you were dealing with floating point random number generation, then simple scaling like that would be a good approach. (However, you'd want to shift to 0 before scaling and then shift back to 1 to get the correct lower bound on the range. The expression would be, I think, 1 + 7 * (rand5() - 1) / 5
. But rand5()
returns an integer, not a float.)