2

Very basic question but I can't seem to find the answer on Google. A standard PRNG will generate a sequence of random bits. How would I use this to produce a sequence of random integers with a uniform probability distribution in the range [0, N)? Moreover, each integer should use (in expectation) log_2(N) bits.

jcai
  • 3,448
  • 3
  • 21
  • 36
  • Check [this](http://stackoverflow.com/questions/20804899/how-to-generate-an-un-biased-random-number-within-an-arbitrary-range-using-the-f) question and the accepted answer. – Alexandru Barbarosie Aug 11 '14 at 22:56
  • In the accepted answer, the first method is biased (which is explained), the second is equivalent to two answers below (use ceiling(log2(N)) bits and retry if too high), and the last method is equivalent to my own answer, which is prohibitively expensive in practice. Converting a sequence of bits to base N to produce the desired integers is extremely expensive. – jcai Aug 12 '14 at 12:12
  • Does this answer your question? [How to generate a random integer in the range \[0,n\] from a stream of random bits without wasting bits?](https://stackoverflow.com/questions/6046918/how-to-generate-a-random-integer-in-the-range-0-n-from-a-stream-of-random-bits) – Peter O. Jul 17 '20 at 05:16

5 Answers

3

If you want a random number in the range [0, N):

  • calculate how many bits you need to write N - 1 in binary. That's:

    n_bits = ceiling(log_2(N))
    

    where ceiling is the "round up" operation (ex: ceiling(3) = 3, ceiling(3.7) = 4).

  • take the first n_bits bits of your random bit stream and read them as a decimal number.

  • if that decimal number is N or more, well... you discard it and try again with the next n_bits bits until it works.

Example for N = 12:

  • n_bits = ceiling(log_2(12)) = 4

  • you take the first 4 bits of your random bit sequence, which might be "1101"

  • you turn "1101" into a decimal number, which gives 13. That's not below 12, no good. So:

  • take the next 4 bits in your random sequence, which might be "0111".

  • turn "0111" into a decimal, which gives 7. That works!

Hope it helps.
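
For what it's worth, here is a minimal Python sketch of this rejection method. The bit_source() callable, assumed to return one random bit per call, stands in for whatever PRNG output you have:

    import random

    def rand_below(bit_source, n):
        """Rejection sketch: draw ceiling(log2(n)) bits, retry if out of range."""
        n_bits = (n - 1).bit_length()      # ceiling(log_2(n)) for n >= 2
        while True:
            x = 0
            for _ in range(n_bits):        # read the next n_bits bits as an integer
                x = (x << 1) | bit_source()
            if x < n:                      # keep it only if it is below n
                return x

    # Example, with Python's generator standing in for the random bit stream:
    print(rand_below(lambda: random.getrandbits(1), 12))   # uniform on [0, 12)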

nico7et8
  • 217
  • 1
  • 8
  • For bad N this takes 2*ceil(log2(N)) bits. Suppose N=2^10+1=1025, so you use 11 bits each time. You have a ~1/2 chance of rejection so it takes expected value of 2 tries, each costing 11 bits. – jcai Aug 11 '14 at 03:08
  • Yup, that's true. But if you choose not to discard the sequence (and change it to make it smaller) you take the risk of introducing a bias in your random choice. But you're right, not effective at all for some Ns. – nico7et8 Aug 11 '14 at 14:59
  • 1
    @user49164 failing once is 50%, failing twice in a row is 25%, 3 times, 12.5% and so on... check [this](http://stackoverflow.com/questions/20804899/how-to-generate-an-un-biased-random-number-within-an-arbitrary-range-using-the-f/20818831#20818831) link for more info. – Alexandru Barbarosie Aug 11 '14 at 22:55
  • 2
    @Alexandru which makes the expected number of tries equal to 2, as I said. – jcai Aug 12 '14 at 09:39
0

Actually most standard PRNGs such as linear congruential generators or Mersenne twister generate sequences of integer values. Even generalized feedback shift register techniques are usually implemented at the register/word level. I don't know of any common techniques that actually operate at the bit level. That's not to say they don't exist, but they're not common...

Generating values from 1 to N is usually accomplished by taking the integer value produced modulo the desired bound, and then doing an acceptance/rejection stage to make sure you aren't subject to modulo bias. See Java's nextInt(int bound) method, for example, to see how this can be implemented. (Add 1 to the result to get [1,N] rather than [0,N-1].)
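
As a rough Python sketch of that accept/reject idea (not a line-for-line port of Java's nextInt; word_source(), assumed here to return one uniform 31-bit word of PRNG output, stands in for the generator):

    import random

    def bounded_int(word_source, bound):
        """Reduce a 31-bit PRNG word mod bound, rejecting words from the
        incomplete top slice of the range so that no residue is favoured."""
        word_range = 1 << 31
        cutoff = word_range - (word_range % bound)  # largest multiple of bound <= 2**31
        while True:
            w = word_source()
            if w < cutoff:                 # w is uniform on [0, cutoff)
                return w % bound           # so w % bound is uniform on [0, bound)

    # Example, with Python's generator standing in for the word-level PRNG:
    print(bounded_int(lambda: random.getrandbits(31), 10) + 1)   # value in [1, 10]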

pjs
  • 18,696
  • 4
  • 27
  • 56
  • PRNGs usually don't operate at the bit level but if the PRNG operation is relatively expensive then we don't want to waste any output, so we have to look at the output at the bit level. – jcai Aug 11 '14 at 03:31
  • It's generally more expensive to try to operate at the bit level for an algorithm designed at the word level. That's why Java's implementation just throws away the extra bits, and rejects and moves on to the next int if modulo bias enters the picture. What kind of PRNG are you using that's so very expensive? – pjs Aug 11 '14 at 03:38
0

Theoretically this is possible. Find a, b such that 2^a > N^b but is very close. (This can be done by iterating through multiples of log2(N) until one falls just short of an integer.) Take the first a bits and interpret them as a binary number; if it is N^b or more, reject it, otherwise convert it to base N. The b digits give b terms of the desired sequence.

The problem is that converting to base N is very expensive and will cost more than essentially any PRNG, so this is mostly a theoretical answer.
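
A Python sketch of the idea, with b passed in directly rather than searched for (that, and the bit_source() callable returning one random bit per call, are assumptions made for brevity):

    import random

    def batch_of_digits(bit_source, n, b):
        """Draw a bits, where 2**a is just above n**b, reject out-of-range
        blocks, then peel off b base-n digits from the accepted value."""
        limit = n ** b
        a = limit.bit_length()             # smallest a with 2**a > n**b
        while True:
            x = 0
            for _ in range(a):             # read a bits as one big integer
                x = (x << 1) | bit_source()
            if x >= limit:                 # out of range: reject and redraw
                continue
            digits = []
            for _ in range(b):             # the costly base-n conversion
                x, d = divmod(x, n)
                digits.append(d)
            return digits                  # b values, each uniform on [0, n)

    # Example: 5 values in [0, 12) from one accepted block of bits
    print(batch_of_digits(lambda: random.getrandbits(1), 12, 5))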

jcai
  • 3,448
  • 3
  • 21
  • 36
-1

Start with the range [0, N-1] then use 0s and 1s to perform a binary search:

  • 0: lower half
  • 1: upper half

e.g. With N = 16, you start with [0, 15], and the sequence 0, 1, 1, 0 would give:

  1. [0, 7]
  2. [4, 7]
  3. [6, 7]
  4. [6]

If N is not a power of 2, then in some iteration the remaining range can have odd length, in which case a decision needs to be made about whether the middle number belongs to the lower half or the upper half. This can be decided right at the start of the algorithm: roll once, and let 0 mean the middle number always joins the lower half and 1 mean it always joins the upper half.

I think this is at least closer to the uniform distribution you are asking for than the common method of generating log(N) bits and either taking that value as-is or taking it mod N.

To illustrate what I mean, here is how my method generates each number in the range [0, 9] (the leading "0:" or "1:" is the up-front roll, and the bits after it are the ones consumed):

To generate 0
0: 0, 0, 0, 0
1: 0, 0, 0

To generate 1
0: 0, 0, 0, 1
1: 0, 0, 1

To generate 2
0: 0, 0, 1
1: 0, 1, 0

To generate 3
0: 0, 1, 0
1: 0, 1, 1, 0

To generate 4
0: 0, 1, 1
1: 0, 1, 1, 1

To generate 5
0: 1, 0, 0, 0
1: 1, 0, 0

To generate 6
0: 1, 0, 0, 1
1: 1, 0, 1

To generate 7
0: 1, 0, 1
1: 1, 1, 0

To generate 8
0: 1, 1, 0
1: 1, 1, 1, 0

To generate 9
0: 1, 1, 1
1: 1, 1, 1, 1
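
Here is a minimal Python sketch of the halving procedure described above; the bit_source() callable returning one random bit per call is an assumption, and the single up-front roll decides where the middle element of an odd-length range goes:

    import random

    def halving_draw(bit_source, n):
        """Binary-search style draw from [0, n): each bit picks a half of the
        current range; one up-front roll fixes where odd middles go."""
        lo, hi = 0, n - 1                       # current candidate range, inclusive
        middle_goes_low = bit_source() == 0     # the single up-front roll
        while lo < hi:
            size = hi - lo + 1
            half = size // 2
            if size % 2 == 0:
                cut = lo + half - 1             # even split: lower half is [lo, cut]
            elif middle_goes_low:
                cut = lo + half                 # odd split, middle joins the lower half
            else:
                cut = lo + half - 1             # odd split, middle joins the upper half
            if bit_source() == 0:
                hi = cut                        # 0 -> keep the lower half
            else:
                lo = cut + 1                    # 1 -> keep the upper half
        return lo

    # Example: a value in [0, 10), matching the table above for the same bits
    print(halving_draw(lambda: random.getrandbits(1), 10))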

The other easy answer is to generate a large enough binary number such that taking mod N does not (statistically) favor some numbers over others. But I figured that you would not like this answer either because judging from your comments to another answer, you seem to be taking into account efficiency in terms of number of bits generated.

In short, I am not sure why I was downvoted for this answer as this algorithm seems to provide a nice distribution compared to the number of bits it uses (~log(N)).

wookie919
  • 3,054
  • 24
  • 32
  • This doesn't give a uniform distribution when N is not a power of 2. – jcai Aug 11 '14 at 08:55
  • @user49164 Because if N is not a power of 2, in each iteration, the length of the list of remaining numbers can be an odd number, in which case we don't know what to do with the middle number, correct? Or are there other reasons that I have missed? – wookie919 Aug 11 '14 at 09:02
  • In the N=10 example you gave you can see that there is a 1/8 chance of getting 2 and 7 and a 3/32 chance of getting anything else, rather than 1/10 for each. I think this level of error is not acceptable for most applications. – jcai Aug 12 '14 at 09:36
  • @user49164 When N is not a power of 2, none of the answers can claim a *PERFECT* uniform distribution with log(N) bits. For my algorithm, as N gets larger, the differences in chances of getting each number will reduce. I gave my best answer but obviously it was not good enough for the downvoter, and you are obviously in search of an algorithm that gives a perfect uniform distribution with log(N) bits. Good luck with that, I am also genuinely interested in the final answer. – wookie919 Aug 12 '14 at 09:48
-1
  1. Calculate the number of bits required for N (= location of the most significant bit with value 1) - let's call it k.
  2. Take the first k bits from your input stream of bits - let's call it number X.
  3. Result = X mod N.
  4. Advance to the next set of k bits and repeat from step 2 for the next random number.

Alternatively, for better distribution, this can be applied instead of step 3:

  • Ratio = N / 2^k
  • Result = X * Ratio
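
A quick Python sketch of both variants (the bit_source() callable returning one random bit per call is an assumption; as the other answers discuss, neither variant is exactly uniform when N is not a power of 2):

    import random

    def draw_mod(bit_source, n):
        """Steps 1-4: read k = bit length of n bits, then reduce mod n."""
        k = n.bit_length()                 # position of n's most significant 1 bit
        x = 0
        for _ in range(k):                 # next k bits from the stream
            x = (x << 1) | bit_source()
        return x % n

    def draw_scaled(bit_source, n):
        """Alternative step 3: scale by n / 2**k instead of taking the modulus."""
        k = n.bit_length()
        x = 0
        for _ in range(k):
            x = (x << 1) | bit_source()
        return (x * n) >> k                # floor(x * n / 2**k), a value in [0, n)

    bits = lambda: random.getrandbits(1)
    print(draw_mod(bits, 10), draw_scaled(bits, 10))
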
SomeWittyUsername
  • 18,025
  • 3
  • 42
  • 85