-2

Here is the reference implementation and I do not understand why choose 21 here? Thanks.

public static int rand7() {
    while (true) {
        int num = 5 * (rand5() - 1) + (rand5() - 1);
        if (num < 21) return (num % 7 + 1);
    }
}

BTW, I read that question before asking, my specific question is why using 21 here. In that thread, it is not answered. If I missed anything, please feel free to correct. Thanks. :) thanks in advance, Lin

Rotsor
  • 13,655
  • 6
  • 43
  • 57
Lin Ma
  • 9,739
  • 32
  • 105
  • 175
  • 4
    This isn't [tag:python], it's [tag:java]. Anyway this is a duplicate of [Expand a random range from 1–5 to 1–7](http://stackoverflow.com/questions/137783/expand-a-random-range-from-1-5-to-1-7) – smci Oct 03 '15 at 03:41
  • @smci, I read that question before asking, my specific question is why using 21 here. In that thread, it is not answered. If I missed anything, please feel free to correct. Thanks. :) – Lin Ma Oct 03 '15 at 03:47
  • @smci, I mean why not using 14 or 7 here. This is my question. And thanks for correcting, it is not Python. I am using Python. :) – Lin Ma Oct 03 '15 at 03:50
  • 1
    I don't know, but in that other question, they are generating a number <= 21. So it's not even a 7-bit number. Also, the code here is wrong `if (num < 21) ...` should be `>= 21`. It would help if you state the source of this question. – smci Oct 03 '15 at 05:50
  • @smci, thanks and my question is why use 21 other than 14 or 7? 14 and 7 are all times of 7. Any insights are appreciated. :) – Lin Ma Oct 03 '15 at 06:22
  • 1
    this isn't generate a random number < 21. it'll return a number from 1-7 – phuclv Oct 03 '15 at 08:49
  • @LưuVĩnhPhúc, thanks for the inputs and my confusion is how do get this formula? 5 * (rand5() - 1) + (rand5() - 1), why not using other formulas? :) – Lin Ma Oct 04 '15 at 00:21
  • 1
    @LinMa because `rand(5)-1` produces a digit in base 5, so they multiply it by 5 to make it the 5s' unit – phuclv Oct 04 '15 at 03:03
  • @LưuVĩnhPhúc, thanks and wondering why should we multiple by 5, could we multiple by other numbers, saying 3 or 4? For example, 3 * (rand5() - 1) + (rand5() - 1), or 4 * (rand5() - 1) + (rand5() - 1)? – Lin Ma Oct 05 '15 at 05:30

1 Answers1

2

It's because 21 is a multiple of 7.

The term 5 * (rand5() - 1) + (rand5() - 1) produces a number in the range [0, 24] (uniform distribution). This is then used to create a random number in [0, 6] by % 7. However, this does not produce a uniform distribution if you use the entire range. The remainders 0, 1, 2, 3 occur once more than the remainders 4, 5, 6. Therefore, the according numbers that produce one of these remainders are cut off and only the range [0, 20] (< 21) is used. You could equivalently cut off the first 4 numbers (> 3) to produce a uniform distribution.

Nico Schertler
  • 32,049
  • 4
  • 39
  • 70
  • Thanks Nico, and my question is why use 21 other than 14 or 7? 14 and 7 are all times of 7. Any insights are appreciated. :) – Lin Ma Oct 03 '15 at 06:23
  • 2
    Because you want to discard as few numbers as possible in order to return a valid number as soon as possible. 21 is the greatest multiple of 7 which is smaller or equal to 25 (=5^2). – Nico Schertler Oct 03 '15 at 06:34
  • Thanks Nico, smart answer. Then the question is, why we use 5 * (rand5() - 1) + (rand5() - 1), which results in distribution from [0,24], why not use other ranges? I think as long as the range is larger than 7 and uniformly distributed, it is fine? Thanks. – Lin Ma Oct 03 '15 at 07:40
  • 1
    Yes, that's correct. You could use any range. But [0, 24] is probably the most straight-forward (it is a two-digit number in base 5). – Nico Schertler Oct 03 '15 at 19:37
  • Thanks Nico, why we should something base 5? More detailed insights are appreciated. :) – Lin Ma Oct 04 '15 at 00:20
  • 1
    Because then we can just use the generated numbers as digits of another number. You can read more about positional notation on [the Wikipedia page](https://en.wikipedia.org/wiki/Positional_notation). – Nico Schertler Oct 04 '15 at 06:35
  • Thanks Nico, thanks and wondering why should we multiple by 5, could we multiple by other numbers, saying 3 or 4? For example, 3 * (rand5() - 1) + (rand5() - 1), or 4 * (rand5() - 1) + (rand5() - 1) – Lin Ma Oct 05 '15 at 05:31
  • 2
    But then the digits' ranges overlap and you end up with a non-uniform distribution again. E.g. for `3 * (rand5() - 1) + (rand5() - 1)`, the number `4` can be expressed both as `3 * 0 + 4` and as `3 * 1 + 1`. – Nico Schertler Oct 05 '15 at 06:05
  • Thanks Nico, smart and smart. Mark as answered. :) – Lin Ma Oct 05 '15 at 07:16