Expand a random range from 1–5 to 1–7

Question

Given a function which produces a random integer in the range 1 to 5, write a function which produces a random integer in the range 1 to 7.

It proved to be an unexpectedly interesting problem. I am still thinking about how to 1) do it in fixed time and 2) not spoil the uniform distribution (if there was one). — eugensk, Sep 27 '08 at 05:20
We had a similar problem while choosing one player out of 5 with a die. We threw the die in turns, and whoever got the max score was chosen. Uniformity was achieved, but not constant time :) — eugensk, Sep 27 '08 at 05:29
Would I get downvoted if I posted an answer saying that the problem doesn't mandate you have to use the given function and just write one that returns 1-7 randomly? — Doctor Blue, Jan 18 '11 at 02:45
@kiwixz, that will produce "between 1 and 7", but you won't get 3 or 6: {1: 19.96, 2: 20.02, 4: 20.01, 5: 19.99, 7: 20.02} rough percentages testing manually. 7*.2, 7*.4, 7*.6, 7*.8, 7*1. — pythonlarry, Apr 18 '16 at 20:40
Obligatory xkcd: [xkcd.com/221](http://xkcd.com/221) – as said by @Steven Rumbalski on a challenge seeming to be a very close duplicate. — sergiol, Nov 21 '17 at 23:30

Rob McAfee · Answer 1 · 2010-10-05T23:54:21.507

600

This is equivalent to Adam Rosenfield's solution, but may be a bit more clear for some readers. It assumes rand5() is a function that returns a statistically random integer in the range 1 through 5 inclusive.

int rand7()
{
    int vals[5][5] = {
        { 1, 2, 3, 4, 5 },
        { 6, 7, 1, 2, 3 },
        { 4, 5, 6, 7, 1 },
        { 2, 3, 4, 5, 6 },
        { 7, 0, 0, 0, 0 }
    };

    int result = 0;
    while (result == 0)
    {
        int i = rand5();
        int j = rand5();
        result = vals[i-1][j-1];
    }
    return result;
}

How does it work? Think of it like this: imagine printing out this two-dimensional array on paper, tacking it up to a dart board and randomly throwing darts at it. If you hit a non-zero value, it's a statistically random value between 1 and 7, since each of the values 1 through 7 appears exactly three times among the non-zero cells. If you hit a zero, just keep throwing the dart until you hit a non-zero. That's what this code is doing: the i and j indexes randomly select a location on the dart board, and if we don't get a good result, we keep throwing darts.

Like Adam said, this can run forever in the worst case, but statistically the worst case never happens. :)
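The dartboard argument can also be checked by counting cells instead of sampling. The following is a minimal Python sketch, not the original C code: `rand5` here is a hypothetical stand-in built on `random.randint`, and `cell_counts` and `accept_probability` are names introduced only for this illustration.

```python
import random
from collections import Counter

# Same 5x5 grid as the C code above; 0 marks a "throw again" cell.
VALS = [
    [1, 2, 3, 4, 5],
    [6, 7, 1, 2, 3],
    [4, 5, 6, 7, 1],
    [2, 3, 4, 5, 6],
    [7, 0, 0, 0, 0],
]

def rand5():
    """Stand-in for the given generator (assumption): uniform integer in 1..5."""
    return random.randint(1, 5)

def rand7():
    """Throw darts at the grid until a non-zero cell is hit."""
    while True:
        result = VALS[rand5() - 1][rand5() - 1]
        if result != 0:
            return result

# Every one of the 25 cells is equally likely, so uniformity of the output
# only depends on how many cells hold each value.
cell_counts = Counter(v for row in VALS for v in row)

# Fraction of throws that land on a usable (non-zero) cell.
accept_probability = sum(cell_counts[k] for k in range(1, 8)) / 25
```

If the counts come out as three cells for each of 1 through 7 and four zero cells, each throw is accepted with probability 21/25, and an accepted throw is equally likely to be any of the seven values.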

edited Oct 05 '10 at 23:54

answered May 08 '09 at 22:59

Rob McAfee


Is it possible to rewrite this algorithm so that, in case result == 0, you need to call rand5() only once? – Luka Rahne Nov 26 '11 at 16:43
8

I understood the logic behind this solution but can't comprehend how it results in a uniform probability. Can someone explain the math? – user1071840 Nov 15 '12 at 08:37
7

@user1071840 - if `rand5` is uniform, every cell in the `vals` grid has an equal probability of being picked. The grid contains exactly three copies of each integer in the interval [1, 7], plus four zeroes. So the "raw" stream of results tends to an even mixture of [1, 7] values, plus some zeroes that occur a tad more frequently than any individual allowed value. But that doesn't matter because the zeros are stripped out, leaving just an even mixture of [1, 7] values. – Daniel Earwicker Nov 22 '12 at 15:40
@DanielEarwicker. Thanks for the explanation. I'm curious, why can't we do like this: (( rand5()*5 ) -4 ) % 7 + 1. rand5()* 5 translates the range to 1-25..-4 changes it to (1..21)..%7 takes it to 0-6..such that each number appears 3 times..and total 21 numbers.. +1 to translate to 1-7..prob of any number being picked is 3/21.. – user1071840 Dec 07 '12 at 00:13
I'm doing the same thing..it's just being done by invoking rand5() only once. – user1071840 Dec 07 '12 at 00:14
4

The shortcut way to realising the problem with that: if you're only calling rand5() once, then you only have 5 possible outcomes. There is obviously no way to turn that into more than 5 possible outcomes without adding more randomness. – Daniel Earwicker Dec 07 '12 at 10:43
3

The longer version: rand5() can only have the values (1, 2, 3, 4, 5). Therefore rand5() * 5 can only have the values (5, 10, 15, 20, 25), which is not the same as a complete range (1...25). If it did, subtracting 4 would make it (-3...21), but in this case it becomes (1, 6, 11, 16, 21), so the end points are correct but there are four big holes: (2..5), (7..10), (12..15), (17..20). Finally you do mod 7 and add 1, giving (2, 7, 5, 3, 1). So neither 4 nor 6 ever occur. But (see above shortcut) we knew there could only be 5 numbers in the resulting range all along, so there had to be two gaps. – Daniel Earwicker Dec 07 '12 at 10:51
Can't we just add two randoms and take the result mod 7? ( (rand5() + rand5()) % 7 ) – Dineshkumar Jul 06 '13 at 14:30
“But statistically the worst case never happens” quite the contrary: statistically the worst case *can* always happen but is not very likely. In fact, here the worst case is *more* likely than getting a specific value every time forever (since there are four zeros vs. three of any other digit). – Andrew Marshall Jul 30 '13 at 00:22
By the analogy above, why can't we get away with just two rows: {{1,2,3,4,5},{6,7,0,0,0}}, and then call again if it's one of the zeros in the second row? Why do we need all five rows of the dartboard? – gzak Oct 29 '14 at 05:14
1

Ah, because we only have rand5(), not rand2() :-) – gzak Oct 29 '14 at 05:14
2

Is there a reason why you placed the zeros at the end? – TheOne May 20 '15 at 19:52
A variant to this which does always return a value would have (instead of the 5x5 matrix) an array vals[21] = [1,2,3,4,5,6,7,1,2,3,4,5,6,7,1,2,3,4,5,6,7], loop five times over a line i += rand5() and then return vals[i - 5]. This way, the i variable would contain a randomly distributed integer from the range <5,25> and subtracting 5 from its value will randomly choose a number from the vals array above. – David Sep 02 '15 at 12:02
using `do while` will be clearer and you don't need to check and initialize the result first – phuclv Oct 03 '15 at 09:12
This should be the accepted answer! @TheOne: it's because the remaining elements won't randomly distribute the numbers 1 - 7 (there's not enough room for 1...7) – rookie Oct 28 '15 at 18:57
1

@AndrewMarshall "Loop never terminates" is the worst case. The probability of the loop never terminating is 0. Each trial has a 4/25 chance of continuing. Each trial is independent. The probability that the loop will go for n trials is (4/25)^n. lim n to infty (4/25)^n = 0. The loop "will run forever" event has probability of occurring 0, which colloquially means "never happen". "the worst case is more likely than getting a specific value every time forever" This may be a logically valid statement, but it does nothing to refute or support the point you're addressing. – r12 Jun 03 '16 at 01:52
It would be great to get a short output as a summary here etc as .gif animation, linked to your description. Possibly an algebraic presentation too. – Léo Léopold Hertz 준영 Aug 09 '16 at 12:59
@Dineshkumar Yes, if you don't care if some numbers are more common than others. In other words, that gives a non-uniform distribution. – Paul Aug 19 '16 at 21:07
Is it valid statistically to discard 0s (any value) out of rand5()? Rob's algorithm consumes (5x5-4)=21/25=84% of stream values coming out of rand5() since each cell has an equal p of occurrence. Of course, I get the result rand7() is uniform and it is a solution. Is there another test besides checking that 0-6 buckets are uniform that can be done on the source and result stream? – Vortex Oct 15 '16 at 19:22
@David That would give a non-uniform distribution, where 4's and 5's are much more common than 1's and 7's. – Paul Nov 05 '16 at 06:19
this answer is good and really similar to this https://math.stackexchange.com/a/1314479/438622 – daniel Apr 24 '17 at 15:16
why can't we just go up to 15. (`rand5() +rand5()+rand5() -1`) % 7 + 1 – user3833308 Sep 20 '17 at 17:42
Just thought of another way solving this, and shared that here: https://stackoverflow.com/a/47603332/3547167. Would appreciate any comments. – Shambho Dec 02 '17 at 00:31
Anyone has the version that doesn't need a matrix (memory)? I think it can be done by math tricks without a matrix. Thanks. – Denly Apr 02 '20 at 21:08
### rand7 initialized at 0 ### `for i range(5): rand7 += rand5();` ### rand7 between 5 and 25 uniformly ### `rand7 -= 5;` ### rand7 b/w 0 and 20 uniformly ### `rand7 /= 3`; ### rand7 b/w 0 and 6 uniformly ### `rand7 += 1;` ### rand7 b/w 1 and 7 uniformly ### – Ladi Oyeleye Dec 18 '21 at 19:20

score 369 · Answer 2 · answered Sep 26 '08 at 04:41

There is no (exactly correct) solution which will run in a constant amount of time, since 1/7 is an infinite decimal in base 5. One simple solution would be to use rejection sampling, e.g.:


int i;
do
{
  i = 5 * (rand5() - 1) + rand5();  // i is now uniformly random between 1 and 25
} while(i > 21);
// i is now uniformly random between 1 and 21
return i % 7 + 1;  // result is now uniformly random between 1 and 7

This has an expected runtime of 25/21 = 1.19 iterations of the loop, but there is an infinitesimally small probability of looping forever.
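The claims in the code comments can be verified by enumerating all 25 equally likely pairs of rand5() outputs. This is a minimal Python sketch of that check; the helper `combine` and the variable names are introduced here purely for illustration.

```python
from collections import Counter
from fractions import Fraction

def combine(a, b):
    """Map two rand5() outputs (each in 1..5) to a single value in 1..25."""
    return 5 * (a - 1) + b

# All 25 equally likely (a, b) pairs and the value each one produces.
combined = [combine(a, b) for a in range(1, 6) for b in range(1, 6)]

# Keep only the accepted values (i <= 21) and apply the final mapping.
outputs = Counter(i % 7 + 1 for i in combined if i <= 21)

# Number of loop iterations is geometric with success probability 21/25,
# so its expectation is the reciprocal of that probability.
expected_iterations = 1 / Fraction(21, 25)
```

Each pair maps to a distinct value in 1..25, and the 21 accepted values split into seven groups of three, which is the uniformity argument in miniature.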

Adam Rosenfield


7

the -1 is not needed if the >21 is flipped to >26 b/c it doesn't matter where i's lower bound maps to, – BCS Jan 15 '09 at 18:01
3

Uh, can anyone explain how this really works and produces a uniform distribution? The Wikipedia page on rejection sampling was not of much help. – Sundar R May 04 '09 at 05:50
2

I add an answer below that runs faster than this one by about 7%. – Eyal May 04 '09 at 09:29
26

My take on explaining why this is correct: Say that I want to write a program that outputs a stream of uniform random numbers from 1 to 25; for that I'd just return 5 * (rand5() - 1) + rand5() as in the code in the answer. Now, if I want to build a stream of uniform random numbers between 1 and 21, if I just use the first stream but filter it so that numbers in [22, 25] are rejected, I can build that stream too. Next, if I take this stream and filter it so that for each element x I output x % 7 + 1, I have a stream of uniform random numbers from 1 to 7! Quite simple, isn't it? :D – Paggas May 05 '09 at 06:14
1

@Eyal: +1, Great solution! While a 7% speed boost is not normally noteworthy, your solution has the property that, unlike mine, it never discards any randomness. My solution technically loses information in the case that i > 21. I'm afraid this comment box is too narrow to contain a detailed explanation. See also problem 2(a) (and its solution) of homework 1 at http://is.gd/x4gQ . – Adam Rosenfield May 06 '09 at 02:06
1

@Nixuz: I've said it before and I'll say it again: there is NOT a guaranteed worst case constant time solution that generates a perfectly uniform distribution, because 5^n is not divisible by 7 for any positive integer n. Pax's solution is _very close_ to uniform, but it is not _perfectly_ uniform. – Adam Rosenfield May 08 '09 at 03:14
I tested your algorithm in my harness and it does do marginally better at distribution (about a 2% improvement over 20 runs of 700000 samples). Your comments that my solution is a problem because there's no way to get from 0 to 6 in sequence is rubbish. ALL linear congruential methods have that problem and they seem to work fine. What matters is the distribution over a large enough set. If you're talking about true randomness (not something like linear congruential), that makes your solution provably incorrect since a sequence of 5,5,... ad infinitum is as equally likely as any other :-) – paxdiablo May 08 '09 at 03:57
@Pax: How is my comment about no way to get from 0 to 6 rubbish? There _is_ no way to get from 0 to 6 with yours, regardless of the source of the underlying rand5() function. The problem assumes a true RNG for rand5(), not a pseudorandom linear congruential generator. It's true the linear congruential generators have correlation problems (especially if you look at more than 2 consecutive outputs), but those are not the subject of this problem. (continuing...) – Adam Rosenfield May 08 '09 at 04:07
1

(continued...) If rand5() produces truly random outputs, with all sequences of a given length equally likely, then my rand7() will also be truly random, with all sequences of a given length equally likely. Yours will not. – Adam Rosenfield May 08 '09 at 04:08
If you're talking about a true RNG, then your solution is provably incorrect. Any input (sequence of numbers) that consistently stays above 21 in your calculation renders your code into an infinite loop. I would rather have a provably correct (in terms of execution time) algorithm with slightly worse distribution properties, especially as the question called, in part, best CPU speed. – paxdiablo May 08 '09 at 04:17
It boils down to what you want. You can either have a constant-time algorithm with near-perfect distribution or a variable-time algorithm with nearer-to-perfect distribution. I'm not confident that yours is perfect distribution either since it's effectively throwing away values from the rand5() distribution. It appears to be better based on sampling but that's not a mathematical proof. – paxdiablo May 08 '09 at 04:20
Yes, any input that consistently stays above 21 will cause an infinite loop, but such a sequence has 0 probability of occurring with a true RNG -- the limit of (4/25)^n as n goes to infinity is 0. But with a pseudo-RNG, there is a real danger of infinite looping. Of course, this could easily be solved by, say, looping at most a fixed number of times, which would then result in a non-uniform distribution. The degree of non-uniformity would depend on the maximum number of iterations. – Adam Rosenfield May 08 '09 at 04:25
7

And you're correct that it boils down to whether you want a perfect distribution with unbounded worst case runtime, or an imperfect distribution with a bounded runtime. This is a consequence of the fact that no power of 5 is divisible by 7, or equivalently, if you have 5^n equally probable sequences of length n, there is no way to assign to each sequence a number from 1 to 7 such that each of 1..7 is equally probable. – Adam Rosenfield May 08 '09 at 04:27
The proof that my algorithm is correct is contained in the comments in the code. After assigning i in the loop, all values from 1 to 25 are equally probable. If the loop terminates, then all values from 1 to 21 are equally probable, since for any k in [1, 21], P(i=k given loop terminated) = P(i=k)/P(loop terminated) (by definition of conditional probability) = (1/25)/(21/25) = 1/21 for all k. Thus, after performing the final modulus, all 7 values are equally likely. Consecutive samples are clearly independent of each other, since there is no saved state between calls. – Adam Rosenfield May 08 '09 at 04:32
Err, that should be P(i=k given loop terminated) = P(i=k and loop terminated)/P(loop terminated) = (1/25)/(21/25) = 1/21 for 1 <= k <= 21, and it is 0 for 22 <= k <= 25, since P(i=k and loop terminated) is 0 for such k. – Adam Rosenfield May 08 '09 at 04:37
1

We are going to have to agree to disagree on this one, Adam. A true RNG does *not* have zero probability of generating billions of 5s in a row (or even an infinite number, for that matter), that's as likely as any other sequence. A pseudo RNG *does* have zero probability of doing that by virtue of the fact that it is deterministic. And I still don't believe you have a perfect distribution with exclusion since you're throwing away information from the sequence that does have perfect distribution. I've passed it onto some mathematicians at the state uni here to see what they say. Stay tuned. – paxdiablo May 08 '09 at 07:54
A true RNG does not have zero probability of generating a billion 5's in a row (that has probability 1/5^(10^9)). The probability of generating an infinite sequence of 5's is 0. I may be throwing away information (I'm using an average of 2.38 calls to rand5() per call to rand7(), whereas an optimal algorithm would use an average of log(7)/log(5) = 1.209 calls on average), but I still produce a uniform distribution. – Adam Rosenfield May 08 '09 at 14:35
I have to admit that this answer has the best accuracy, in 1 million tests there was a variance of no more than 500 between the numbers; other methods gave a variance as large as 100k. – Robert K Jun 01 '09 at 14:12
2

Why all the trouble? Won't this work?? int i; do { i = rand5() + rand5(); } while(i > 7); return i; Whats wrong with this?? – Lazer Sep 18 '09 at 11:10
3

@eSKay: No, that won't work, it produces a highly non-uniform distribution. It doesn't produce 1 at all, and it produces 2-7 with probabilities 1/19, 2/19, 3/19, 4/19, 5/19, and 4/19 respectively. – Adam Rosenfield Sep 18 '09 at 15:37
`But there is an infinitesimally small probability of looping forever.` ;-/ – Petr Peller May 12 '10 at 08:55
"// i is now uniformly random between 1 and 21" should be "// i is now ALMOST uniformly random between 1 and 21" – understack Aug 18 '10 at 11:22
Multiply by 4 instead of 5. Then numbers should be uniform between 1 and 21. Then while condition would not be required. – understack Aug 18 '10 at 11:34
3

@understack: No. If you multiply by 4 instead of 5, it is no longer uniform. There are then 2 ways to get 5 (4*0+5 = 4*1+1) but only 1 way to get 1 (4*0+1). – Adam Rosenfield Aug 20 '10 at 16:59
@Adam Rosenfield: that's true but I guess if you put condition like while (i>21), its not 'uniform' as well. technically, filtering out these cases means uniformity is disturbed. – understack Aug 21 '10 at 19:19
1

@understack: No, it _is_ uniform. Once the `while(i > 21)` loop terminates, all values between 1 and 21 are equally likely. Read up on rejection sampling http://en.wikipedia.org/wiki/Rejection_sampling . – Adam Rosenfield Aug 21 '10 at 20:27
1

Adam, could you please point me to a (mathematical) demonstration that there is no possible solution running in constant time? I don't really understand how "1/7 is an infinite decimal in base 5" proves that. – Jules Olléon Jan 30 '11 at 08:58
5

@Jules Olléon: Suppose there were a solution running in constant time that was guaranteed to make no more than `N` calls to `rand5()` in the worst case. Then, there are 5^N possible outcomes of the sequence of calls to `rand5`, each of which has an output of 1-7. So, if you add up all of the possible sequences of calls whose output is `k` for each 1≤k≤7, then the probability that the output is `k` is m/5^N, where m is the number of such sequences. So, m/5^N = 1/7, but there are no possible integer solutions (N,m) to this ==> contradiction. – Adam Rosenfield Jan 30 '11 at 19:45
3

@Jules Olléon: And the problem of solving m/5^N = 1/7 in the integers is equivalent to whether or not 1/7 is a terminating decimal in base 5. – Adam Rosenfield Jan 30 '11 at 19:47
4

@paxdiablo: You are incorrect. The chance of a true RNG generating an infinite sequence of 5's is exactly 0, using similar reasoning to the fact that [flipping a coin an infinite number of times is guaranteed not to generate an infinite number of consecutive heads](http://math.stackexchange.com/questions/607). This also means the chance of this code looping forever is exactly 0 (though there is a positive chance it will loop for any arbitrary number of iterations). – BlueRaja - Danny Pflughoeft May 23 '11 at 16:45
There is a real possibility of infinite loop. The solution is not elegant. – robermorales Sep 21 '11 at 07:24
can somebody please explain how "5 * (rand5() - 1) + rand5()" produces a uniformly random between 1 and 25 and "return i % 7 + 1;" produces a uniformly random between 1 and 7? Detailed explanation would be very helpful. Thanks. – ASKN Apr 28 '12 at 11:55
@AdamRosenfield : If you look at http://stackoverflow.com/questions/2509679/how-to-generate-a-random-number-from-within-a-range-c ..Don't you think the recursive function what Ryan Reich described,should be used, while narrowing '1..21' range to '1..7'? – Arvind Apr 12 '13 at 17:27
How did you figure out 5*(rand5()-1) ? Did you have to individually work out all possible cases to see they were equally likely? – user1775614 Dec 18 '13 at 00:16
@AdamRosenfield Can you please give some links/tutorial to understand the theory behind your solution specifically from where 5 * (rand5() - 1) + rand5() comes? etc. Thanks – MA1 Jun 01 '16 at 21:10
@BCS I think removing -1 will make the distribution uneven, as earlier each number from 1-21 contributed equally, after the mod. But if the numbers increase to 26 then the modulus values will be an uneven distribution. `22%7=1`, `23%7=2` till 26 will create uneven distribution for 1,2,3,4 and 5. – rd22 Jul 14 '16 at 17:47
If `rand5()` gives a uniform distribution from 1-5, why can't we use `rand5()*rand5()` instead of `5 * (rand5() - 1) + rand5()`? – PGT Jun 01 '17 at 04:27
@paxdiablo I'm truly curious as to what's the verdict by your mathematician friends (your last comment). Can you please update ? AFAIK distribution is equal to all [1,21] numbers hence this suggestion is indeed "perfect". But I'd love to be proved wrong (and explained why - so I could learn oc). – Nir Alfasi Oct 06 '17 at 07:32
@AdamRosenfield: Just thought of another way solving this, and shared that here: https://stackoverflow.com/a/47603332/3547167. Would appreciate any comments. – Shambho Dec 02 '17 at 00:31
An exact solution, if it interests you. – Shambho Dec 02 '17 at 00:49
@Shambho Wouldn't this contradict this [proof that there is no constant time solution](https://stackoverflow.com/questions/137783/expand-a-random-range-from-1-5-to-1-7?rq=1#comment5379908_137809)? So either must be incorrect. – pcworld Oct 20 '18 at 11:50

score 154 · Answer 3 · edited May 23 '17 at 11:55

I'd like to add another answer, in addition to my first answer. This answer attempts to minimize the number of calls to rand5() per call to rand7(), to maximize the usage of randomness. That is, if you consider randomness to be a precious resource, we want to use as much of it as possible, without throwing away any random bits. This answer also has some similarities with the logic presented in Ivan's answer.

The entropy of a random variable is a well-defined quantity. For a random variable which takes on N states with equal probabilities (a uniform distribution), the entropy is log₂ N. Thus, rand5() has approximately 2.32193 bits of entropy, and rand7() has about 2.80735 bits of entropy. If we hope to maximize our use of randomness, we need to use all 2.32193 bits of entropy from each call to rand5(), and apply them to generating 2.80735 bits of entropy needed for each call to rand7(). The fundamental limit, then, is that we can do no better than log(7)/log(5) = 1.20906 calls to rand5() per call to rand7().
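The figures quoted above are easy to reproduce. The Python sketch below only restates the arithmetic; the variable names are illustrative, and the comparison value `rejection_calls` assumes the two-calls-per-attempt rejection sampler with acceptance probability 21/25 from the earlier answer.

```python
import math

# Entropy in bits of one uniform draw from 5 and from 7 states.
bits_per_rand5 = math.log2(5)
bits_per_rand7 = math.log2(7)

# Lower bound on rand5() calls needed per rand7() output.
min_calls = bits_per_rand7 / bits_per_rand5

# For comparison: two calls per attempt, 25/21 attempts on average.
rejection_calls = 2 * 25 / 21
```

The gap between `min_calls` (about 1.21) and `rejection_calls` (about 2.38) is the randomness that simple rejection sampling throws away.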

Side notes: all logarithms in this answer will be base 2 unless specified otherwise. rand5() will be assumed to return numbers in the range [0, 4], and rand7() will be assumed to return numbers in the range [0, 6]. Adjusting the ranges to [1, 5] and [1, 7] respectively is trivial.

So how do we do it? We generate an infinitely precise random real number between 0 and 1 (pretend for the moment that we could actually compute and store such an infinitely precise number -- we'll fix this later). We can generate such a number by generating its digits in base 5: we pick the random number 0.a₁a₂a₃..., where each digit a_i is chosen by a call to rand5(). For example, if our RNG chose a_i = 1 for all i, then ignoring the fact that that isn't very random, that would correspond to the real number 1/5 + 1/5² + 1/5³ + ... = 1/4 (sum of a geometric series).

Ok, so we've picked a random real number between 0 and 1. I now claim that such a random number is uniformly distributed. Intuitively, this is easy to understand, since each digit was picked uniformly, and the number is infinitely precise. However, a formal proof of this is somewhat more involved, since now we're dealing with a continuous distribution instead of a discrete distribution, so we need to prove that the probability that our number lies in an interval [a, b] equals the length of that interval, b - a. The proof is left as an exercise for the reader =).

Now that we have a random real number selected uniformly from the range [0, 1], we need to convert it to a series of uniformly random numbers in the range [0, 6] to generate the output of rand7(). How do we do this? Just the reverse of what we just did -- we convert it to an infinitely precise decimal in base 7, and then each base 7 digit will correspond to one output of rand7().

Taking the example from earlier, if our rand5() produces an infinite stream of 1's, then our random real number will be 1/4. Converting 1/4 to base 7, we get the infinite decimal 0.15151515..., so we will produce as output 1, 5, 1, 5, 1, 5, etc.

Ok, so we have the main idea, but we have two problems left: we can't actually compute or store an infinitely precise real number, so how do we deal with only a finite portion of it? Secondly, how do we actually convert it to base 7?

One way we can convert a number between 0 and 1 to base 7 is as follows:

1. Multiply by 7.
2. The integral part of the result is the next base 7 digit.
3. Subtract off the integral part, leaving only the fractional part.
4. Go to step 1.
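As a small illustration of these steps, here is a Python sketch that uses exact rational arithmetic so no rounding creeps in. The function name `base7_digits` is made up for this example, and it assumes an input in the range 0 <= x < 1.

```python
from fractions import Fraction

def base7_digits(x, count):
    """Return the first `count` base 7 digits of a fraction 0 <= x < 1."""
    digits = []
    for _ in range(count):
        x *= 7             # multiply by 7
        digit = int(x)     # the integral part is the next base 7 digit
        digits.append(digit)
        x -= digit         # keep only the fractional part, then repeat
    return digits

# The running example: 1/4 is 0.151515... in base 7.
quarter_digits = base7_digits(Fraction(1, 4), 6)
```

Running it on 1/4 reproduces the 1, 5, 1, 5, ... pattern from the example above.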

To deal with the problem of infinite precision, we compute a partial result, and we also store an upper bound on what the result could be. That is, suppose we've called rand5() twice and it returned 1 both times. The number we've generated so far is 0.11 (base 5). Whatever the rest of the infinite series of calls to rand5() produce, the random real number we're generating will never be larger than 0.12: it is always true that 0.11 ≤ 0.11xyz... < 0.12.

So, keeping track of the current number so far, and the maximum value it could ever take, we convert both numbers to base 7. If they agree on the first k digits, then we can safely output the next k digits -- regardless of what the infinite stream of base 5 digits are, they will never affect the next k digits of the base 7 representation!

And that's the algorithm -- to generate the next output of rand7(), we generate only as many digits of rand5() as we need to ensure that we know with certainty the value of the next digit in the conversion of the random real number to base 7. Here is a Python implementation, with a test harness:

import math
import random

rand5_calls = 0
def rand5():
    # Count calls so the harness can report the average per rand7() output.
    global rand5_calls
    rand5_calls += 1
    return random.randint(0, 4)

def rand7_gen():
    # The part of the random real not yet output, scaled so that its integer
    # part is the next base 7 digit, lies between state/pow5 and
    # (state + pow7)/pow5.
    state = 0
    pow5 = 1
    pow7 = 7
    while True:
        if state // pow5 == (state + pow7) // pow5:
            # Both bounds agree on the next digit, so it is safe to output it.
            result = state // pow5
            state = (state - result * pow5) * 7
            pow7 *= 7
            yield result
        else:
            # Bounds disagree: narrow the interval with one more base 5 digit.
            state = 5 * state + pow7 * rand5()
            pow5 *= 5

if __name__ == '__main__':
    r7 = rand7_gen()
    N = 10000
    x = list(next(r7) for i in range(N))
    distr = [x.count(i) for i in range(7)]
    expmean = N / 7.0
    expstddev = math.sqrt(N * (1.0/7.0) * (6.0/7.0))

    print('%d TRIALS' % N)
    print('Expected mean: %.1f' % expmean)
    print('Expected standard deviation: %.1f' % expstddev)
    print()
    print('DISTRIBUTION:')
    for i in range(7):
        print('%d: %d   (%+.3f stddevs)' % (i, distr[i], (distr[i] - expmean) / expstddev))
    print()
    print('Calls to rand5: %d (average of %f per call to rand7)' % (rand5_calls, float(rand5_calls) / N))

Note that rand7_gen() returns a generator, since it has internal state involving the conversion of the number to base 7. The test harness calls next(r7) 10000 times to produce 10000 random numbers, and then it measures their distribution. Only integer math is used, so the results are exactly correct.

Also note that the numbers here get very big, very fast. Powers of 5 and 7 grow quickly. Hence, performance will start to degrade noticeably after generating lots of random numbers, due to bignum arithmetic. But remember here, my goal was to maximize the usage of random bits, not to maximize performance (although that is a secondary goal).

In one run of this, I made 12091 calls to rand5() for 10000 calls to rand7(), achieving the minimum of log(7)/log(5) calls on average to 4 significant figures, and the resulting output was uniform.

In order to port this code to a language that doesn't have arbitrarily large integers built-in, you'll have to cap the values of pow5 and pow7 to the maximum value of your native integral type -- if they get too big, then reset everything and start over. This will increase the average number of calls to rand5() per call to rand7() very slightly, but hopefully it shouldn't increase too much even for 32- or 64-bit integers.

+1 for a really interesting answer. Would it be possible, rather than resetting at a certain value, to simply shift off bits that have been used, and move the other bits up, and basically only keeping the bits that are going to be used? Or am I missing something? — Chris Lutz, May 21 '09 at 03:54
I'm not 100% sure, but I believe if you did that, you would skew the distribution ever so slightly (although I doubt that such skew would be measurable without trillions of trials). — Adam Rosenfield, May 21 '09 at 04:44
FTW! I tried to make the bignums smaller but it can't be done because no power of 5 has factors in common with a power of 7! Also, good use of the yield keyword. Very well done. — Eyal, Sep 02 '09 at 07:05
Very nice! Can we retain the extra entropy without growing state? The trick is to notice that both upper- and lower- bounds are at all times rational numbers. We can add, subtract, and multiply these without losing precision. If we do it all in base-35, we're nearly there. The remainder (multiplying by seven and retaining the fractional part) is left as an exercise. — Ian, Aug 14 '11 at 08:27
@adam You must be referring to "cap the values of pow5 and pow7 to the maximum value of your native integral type". I second your belief that this will skew the distribution, at least if done naively. — catalyst, Sep 30 '11 at 10:49
This is really good and my favorite answer. It also has bounded running time. — Isaac, Mar 01 '14 at 08:18
@Isaac: No, this doesn't have bounded running time. No exactly correct answer can have bounded running time. — Adam Rosenfield, Mar 01 '14 at 18:16
As small as it is, with fixed number of digits, there is a probability of not getting a single random digit ever. — , Apr 13 '15 at 14:46

score 37 · Answer 4 · edited May 23 '17 at 12:26

(I have stolen Adam Rosenfield's answer and made it run about 7% faster.)

Assume that rand5() returns one of {0,1,2,3,4} with equal distribution and the goal is to return one of {0,1,2,3,4,5,6} with equal distribution.

int rand7() {
  int i = 5 * rand5() + rand5();
  int max = 25;
  //i is uniform among {0 ... max-1}
  while(i < max%7) {
    //i is uniform among {0 ... (max%7 - 1)}
    i *= 5;
    i += rand5(); //i is uniform {0 ... (((max%7)*5) - 1)}
    max %= 7;
    max *= 5; //once again, i is uniform among {0 ... max-1}
  }
  return i % 7;
}

The variable max tracks how many equally likely values i can currently take, so i is always uniform among 0 ... max-1. If the result so far is between max%7 and max-1, then it is uniformly distributed in that range, whose size is a multiple of 7, so i%7 is uniform. If not, we use the remainder, which is uniform between 0 and max%7-1, and another call to rand5() to make a new number and a new max. Then we start again.

Edit: The expected number of calls to rand5() is x in this equation:

x =  2     * 21/25
   + 3     *  4/25 * 14/20
   + 4     *  4/25 *  6/20 * 28/30
   + 5     *  4/25 *  6/20 *  2/30 * 7/10
   + 6     *  4/25 *  6/20 *  2/30 * 3/10 * 14/15
   + (6+x) *  4/25 *  6/20 *  2/30 * 3/10 *  1/15
x = about 2.21 calls to rand5()
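The equation is linear in x, so it can be solved exactly. The Python sketch below does so with rational arithmetic; the `stages` list simply transcribes the acceptance fractions from the equation above, and the names `reach` and `constant` are introduced here for illustration.

```python
from fractions import Fraction as F

# (rand5() calls used so far, probability of stopping at this stage once reached)
stages = [
    (2, F(21, 25)),
    (3, F(14, 20)),
    (4, F(28, 30)),
    (5, F(7, 10)),
    (6, F(14, 15)),
]

reach = F(1)      # probability of reaching the current stage
constant = F(0)   # sum over stages of calls * P(stopping exactly there)
for calls, p_stop in stages:
    constant += calls * reach * p_stop
    reach *= 1 - p_stop

# After the last stage the process starts over having already spent 6 calls:
#   x = constant + (6 + x) * reach   =>   x = (constant + 6 * reach) / (1 - reach)
x = (constant + 6 * reach) / (1 - reach)
```

Converting `x` to a float gives roughly 2.212, consistent with the "about 2.21" figure and with the roughly 7% saving over 2.38 calls.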

Community


answered Apr 30 '09 at 13:56

Eyal

Results cataloged in 1,000,000 tries: 1=47216; 2=127444; 3=141407; 4=221453; 5=127479; 6=167536; 7=167465. As you can see, distribution is lacking in respect to the odds of getting a 1. – Robert K Jun 01 '09 at 14:02
@The Wicked Flea: I think you're mistaken. Are you sure that the input rand5() you were using for your test produced 0-4 instead of 1-5, as specified in this solution? – Adam Rosenfield Jun 10 '09 at 00:38
adding uniformly distributed numbers does not result in a uniformly distributed number. In fact, you only need to sum 6 such uniformly distributed variables to get a reasonable approximation to a normal distribution. – Mitch Wheat Feb 15 '13 at 08:12
@MitchWheat - Adding two uniformly distributed integers does, in fact, result in a uniformly distributed random integer provided each possible sum can be generated in exactly one way. That happens to be the case in the expression `5 * rand5() + rand5()`. – Ted Hopp Jun 15 '15 at 13:06

score 30 · Answer 5 · edited Aug 08 '11 at 17:59

Algorithm:

7 can be represented in a sequence of 3 bits

Use rand(5) to randomly fill each bit with 0 or 1.
For example: call rand(5) and

if the result is 1 or 2, fill the bit with 0
if the result is 4 or 5, fill the bit with 1
if the result is 3 , then ignore and do it again (rejection)

This way we can fill 3 bits randomly with 0/1 and thus get a number from 0-7; rejecting 0 and trying again leaves a number from 1-7.

EDIT: This seems like the simplest and most efficient answer, so here's some code for it:

public static int random_7() {
    int returnValue = 0;
    while (returnValue == 0) {
        for (int i = 1; i <= 3; i++) {
            returnValue = (returnValue << 1) + random_5_output_2();
        }
    }
    return returnValue;
}

private static int random_5_output_2() {
    while (true) {
        int flip = random_5();

        if (flip < 3) {
            return 0;
        }
        else if (flip > 3) {
            return 1;
        }
    }
}
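The cost of this scheme can be worked out directly. The short calculation below is a sketch of that arithmetic, not part of the original answer: each bit needs one accepted roll (a 3 is rejected one time in five), each candidate needs three bits, and the all-zero candidate is rejected.

```python
from fractions import Fraction

calls_per_bit = Fraction(1) / Fraction(4, 5)   # a 3 is rejected 1 time in 5
calls_per_attempt = 3 * calls_per_bit          # three bits per candidate value
p_attempt_ok = Fraction(7, 8)                  # the candidate 000 is rejected
expected_calls = calls_per_attempt / p_attempt_ok
```

The result is 30/7, the same figure quoted in comments on the other bit-based answers on this page.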

edited Aug 08 '11 at 17:59

Lance Roberts

answered Nov 14 '10 at 21:23

Anand

There's always the faint spectre of the halting problem, since a poor random number generator could just generate a *lot* of threes at some point. – Alex North-Keys Apr 18 '12 at 13:31
"if the result is 1 or 2, fill the bit with 0 if the result is 4 or 5, fill the bit with 1" What is the logic by which 1,2,4,5 were accepted and 3 was rejected? Can you explain this? – gkns Dec 11 '13 at 09:38
@gkns There is no logic, you could have 1 and 2 mean fill with 0 bit and 3 and 4 mean fill with 1. The important thing is that each option has 50% chances of occurring, thus guaranteeing that the randomness of your function is at least as random as the original rand(5) function. It's a great solution! – Mo Beigi Apr 06 '15 at 08:24
This is neither simple nor efficient. The number of calls to random_5 per random_7 is at best 3, and usually more. Other solutions on this page are closer to the actual best, which is around 2.2. – Eyal Sep 03 '15 at 07:48
Doesn't this give you a random number between 0 and 7 as opposed to 1 and 7? – NicholasFolk Nov 11 '15 at 19:08
Nevermind, I missed the "while returnValue == 0" part – NicholasFolk Nov 11 '15 at 19:14

score 19 · Answer 6 · answered Sep 26 '08 at 05:03

int randbit( void )
{
    while( 1 )
    {
        int r = rand5();
        if( r <= 4 ) return(r & 1);
    }
}

int randint( int nbits )
{
    int result = 0;
    while( nbits-- )
    {
        result = (result<<1) | randbit();
    }
    return( result );
}

int rand7( void )
{
    while( 1 )
    {
        int r = randint( 3 ) + 1;
        if( r <= 7 ) return( r );
    }
}

A correct solution, making an average of 30/7 = 4.29 calls to rand5() per call to rand7(). – Adam Rosenfield May 08 '09 at 03:30

BCS · Answer 7 · 2012-09-14T18:39:40.013

18

rand7() = (rand5()+rand5()+rand5()+rand5()+rand5()+rand5()+rand5())%7+1

Edit: That doesn't quite work. It's off by about 2 parts in 1000 (assuming a perfect rand5). The buckets get:

value   Count  Error%
1       11158  -0.0035
2       11144  -0.0214
3       11144  -0.0214
4       11158  -0.0035
5       11172  +0.0144
6       11177  +0.0208
7       11172  +0.0144

Summing n calls to rand5() instead of 7 shrinks the error:

n   Error%
10  +/- 1e-3
12  +/- 1e-4
14  +/- 1e-5
16  +/- 1e-6
...
28  +/- 3e-11

The error seems to shrink by an order of magnitude for every 2 calls added.

BTW: the table of errors above was not generated via sampling but by the following recurrence relation:

p[x,n] is the number of ways output=x can happen given n calls to rand5.

  p[1,1] ... p[5,1] = 1
  p[6,1] ... p[7,1] = 0

  p[1,n] = p[7,n-1] + p[6,n-1] + p[5,n-1] + p[4,n-1] + p[3,n-1]
  p[2,n] = p[1,n-1] + p[7,n-1] + p[6,n-1] + p[5,n-1] + p[4,n-1]
  p[3,n] = p[2,n-1] + p[1,n-1] + p[7,n-1] + p[6,n-1] + p[5,n-1]
  p[4,n] = p[3,n-1] + p[2,n-1] + p[1,n-1] + p[7,n-1] + p[6,n-1]
  p[5,n] = p[4,n-1] + p[3,n-1] + p[2,n-1] + p[1,n-1] + p[7,n-1]
  p[6,n] = p[5,n-1] + p[4,n-1] + p[3,n-1] + p[2,n-1] + p[1,n-1]
  p[7,n] = p[6,n-1] + p[5,n-1] + p[4,n-1] + p[3,n-1] + p[2,n-1]
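The recurrence can be run mechanically. The following is a minimal sketch of it as a dynamic program over residues mod 7, assuming rand5 returns 1 to 5; the function name and the residue indexing (output k+1 is stored at index k) are choices made here for illustration.

```python
def sum_mod7_counts(n):
    """counts[k] = number of ways n rolls of 1..5 give a sum with sum % 7 == k."""
    counts = [1, 0, 0, 0, 0, 0, 0]   # the empty sum has residue 0
    for _ in range(n):
        nxt = [0] * 7
        for residue, ways in enumerate(counts):
            for roll in range(1, 6):
                nxt[(residue + roll) % 7] += ways
        counts = nxt
    return counts

seven_rolls = sum_mod7_counts(7)
# Largest gap between any two buckets, as a fraction of all 5**7 outcomes.
spread = (max(seven_rolls) - min(seven_rolls)) / 5 ** 7
```

Because 5^n is never a multiple of 7, the seven counts can never all be equal, whatever n is used.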

edited Sep 14 '12 at 18:39

answered Sep 26 '08 at 04:51

BCS

This is not a uniform distribution. It's _very close_ to uniform, but not perfectly uniform. – Adam Rosenfield Jan 15 '09 at 18:06
Ah! Dice and 7's. If you are going to say I'm wrong, you shouldn't leave the proof as an exercise for the reader. – BCS Jan 25 '09 at 00:22
The proof that it's not uniform is simple: there are 5^7 possible ways the randomness can go, and as 5^7 is not a multiple of 7, it's not possible that all 7 sums are equally likely. (Basically, it boils down to 7 being relatively prime to 5, or equivalently 1/7 not being a terminating decimal in base 5.) In fact it's not even the "most uniform" possible under this constraint: direct computation shows that of the 5^7=78125 sums, the number of times you get values 1 to 7 is {1: 11145, 2: 11120, 3: 11120, 4: 11145, 5: 11190, 6: 11215, 7: 11190}. – ShreevatsaR Apr 30 '09 at 16:05
@ShreevatsaR So what if instead of taking the sum of rand5() seven times, we did it 5*7 takes - wouldn't that work? 35^7 % 7 = 35^5 % 7 = 0. – kba Jan 01 '12 at 18:08
@KristianAntonsen: However many times you call rand5(), you won't get a uniform distribution. If you do it N times, there are 5^N possible outputs, which is not divisible by 7. (If you do it 35 times, there are 5^35, not 35^7.) You'll get closer and closer to uniform the larger number of calls you use (and it can be any number, doesn't have to be divisible by 7), but IMHO instead of using a very large number of calls to rand(), you may as well use the probabilistic algorithm in the top answers, which gives an exact uniform distribution and whose expected number of calls to rand() is small. – ShreevatsaR Jan 02 '12 at 02:10
@engin that would make it neither uniform nor on the correct range. – BCS Jul 26 '13 at 16:23

score 16 · Answer 8 · answered Sep 26 '08 at 04:48

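int ans = 0;
int r;
while (ans == 0)
{
     for (int i = 0; i < 3; i++)
     {
          while ((r = rand5()) == 3) {}   // reject 3 so the other outcomes split evenly
          ans += (r < 3) << i;            // left shift: set bit i with probability 1/2
     }
}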

answered Sep 26 '08 at 04:48

Nescio

A correct solution, making an average of 30/7 = 4.29 calls to rand5() per call to rand7(). – Adam Rosenfield May 08 '09 at 04:12
Needs to be **left shift** for the algorithm to work : `ans += (r < 3) << i` – woolfie Jul 14 '16 at 17:25

score 13 · Answer 9 · edited Nov 11 '19 at 20:02

int rand7() {
    int value = rand5()
              + rand5() * 2
              + rand5() * 3
              + rand5() * 4
              + rand5() * 5
              + rand5() * 6;
    return value%7;
}

Unlike the chosen solution, the algorithm will run in constant time. It does however make 2 more calls to rand5 than the average run time of the chosen solution.

Note that this generator is not perfect (the number 0 has 0.0064% more chance than any other number), but for most practical purposes the guarantee of constant time probably outweighs this inaccuracy.

Explanation

This solution is derived from the fact that the number 15,624 is divisible by 7 and thus if we can randomly and uniformly generate numbers from 0 to 15,624 and then take mod 7 we can get a near-uniform rand7 generator. Numbers from 0 to 15,624 can be uniformly generated by rolling rand5 6 times and using them to form the digits of a base 5 number as follows:

rand5 * 5^5 + rand5 * 5^4 + rand5 * 5^3 + rand5 * 5^2 + rand5 * 5 + rand5

Properties of mod 7 however allow us to simplify the equation a bit:

5^5 = 3 mod 7
5^4 = 2 mod 7
5^3 = 6 mod 7
5^2 = 4 mod 7
5^1 = 5 mod 7

So

rand5 * 5^5 + rand5 * 5^4 + rand5 * 5^3 + rand5 * 5^2 + rand5 * 5 + rand5

becomes

rand5 * 3 + rand5 * 2 + rand5 * 6 + rand5 * 4 + rand5 * 5 + rand5
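The reduction of the powers of 5 and the size of the resulting bias can both be checked by brute force. The sketch below enumerates all 5^6 digit combinations, assuming rand5 yields 0 to 4; the variable names are illustrative.

```python
from itertools import product
from collections import Counter

# 5^5, 5^4, ..., 5^0 reduced mod 7
weights = [pow(5, k, 7) for k in range(5, -1, -1)]

# Tally the weighted sum mod 7 over every possible sequence of six rolls.
counts = Counter(
    sum(w * d for w, d in zip(weights, digits)) % 7
    for digits in product(range(5), repeat=6)   # assumes rand5 yields 0..4
)
```

Since the six rolls are independent and identically distributed, assigning the same weights to them in a different order (as the code at the top of this answer does) leaves the tally unchanged.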

Theory

The number 15,624 was not chosen randomly, but can be discovered using Fermat's little theorem, which states that if p is a prime number and a is not divisible by p, then

a^(p-1) = 1 mod p

So this gives us,

(5^6)-1 = 0 mod 7

(5^6)-1 is equal to

4 * 5^5 + 4 * 5^4 + 4 * 5^3 + 4 * 5^2 + 4 * 5 + 4

This is a number in base 5 form and thus we can see that this method can be used to go from any random number generator to any other random number generator, though a small bias towards 0 is always introduced when using the exponent p-1.

To generalize this approach and to be more accurate we can have a function like this:

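import random

def getRandomUniform(frm):
    # stand-in for the given generator (an assumption): uniform in 0..frm-1
    return random.randrange(frm)

def getRandomconverted(frm, to):
    s = 0
    for i in range(to):
        s += getRandomUniform(frm) * frm**i   # s is uniform in 0..frm**to - 1
    mx = frm**to            # number of possible values of s
    mx = (mx // to) * to    # largest multiple of `to` not exceeding that count
    if s < mx:
        return s % to
    else:
        return getRandomconverted(frm, to)    # reject and try again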

This generator is accurate, but _not_ perfectly uniform. To see this, consider the fact that a uniform generator in [0,15624] has 15625 possible outcomes, which isn't divisible by 7. This introduces a bias to the number 0 (which has 2233/15625 chance, and the others just 2232/15625). After all, while using Fermat's little theorem might seem correct at first glance, it says that (5^6)%7=1, and not (5^6)%7=0. The latter is obviously impossible for any exponent because 5 and 7 are both prime numbers. I think it's still an acceptable solution, and I've edited your post to reflect this. — aviator, Jun 04 '17 at 11:20

score 13 · Answer 10 · answered Jan 15 '09 at 04:18

The following produces a uniform distribution on {1, 2, 3, 4, 5, 6, 7} using a random number generator producing a uniform distribution on {1, 2, 3, 4, 5}. The code is messy, but the logic is clear.

public static int random_7(Random rg) {
    int returnValue = 0;
    while (returnValue == 0) {
        for (int i = 1; i <= 3; i++) {
            returnValue = (returnValue << 1) + SimulateFairCoin(rg);
        }
    }
    return returnValue;
}

private static int SimulateFairCoin(Random rg) {
    while (true) {
        int flipOne = random_5_mod_2(rg);
        int flipTwo = random_5_mod_2(rg);

        if (flipOne == 0 && flipTwo == 1) {
            return 0;
        }
        else if (flipOne == 1 && flipTwo == 0) {
            return 1;
        }
    }
}

private static int random_5_mod_2(Random rg) {
    return random_5(rg) % 2;
}

private static int random_5(Random rg) {
    return rg.Next(5) + 1;
}
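Why SimulateFairCoin is fair, and what it costs, can be shown with a few lines of exact arithmetic. This is a sketch of the reasoning rather than part of the original answer; the variable names are assumptions.

```python
from fractions import Fraction

p_one = Fraction(3, 5)    # random_5() % 2 == 1 for the rolls 1, 3, 5
p_zero = Fraction(2, 5)   # random_5() % 2 == 0 for the rolls 2, 4

p_heads = p_zero * p_one  # flipOne == 0 and flipTwo == 1
p_tails = p_one * p_zero  # flipOne == 1 and flipTwo == 0
p_pair_accepted = p_heads + p_tails

calls_per_coin = 2 / p_pair_accepted                    # two biased flips per attempt
calls_per_rand7 = 3 * calls_per_coin / Fraction(7, 8)   # 3 coins, 000 rejected
```

The two accepted patterns have the same probability whatever the bias of the underlying flip, which is what makes the simulated coin fair.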

answered Jan 15 '09 at 04:18

jason

A correct solution (which puts you way ahead of the curve), although not very efficient. This makes an average of 25/6 = 4.17 calls to random_5_mod_2 per fair coin flip, for a total average of 100/7 = 14.3 calls to random_5() per call to random_7(). – Adam Rosenfield May 08 '09 at 03:28
The advantage of this solution over the others is that it can be easily expanded to produce any other uniformly distributed range. Just randomly select each one of the bits, re-rolling on invalid values (like the 0 value in our current solution that produces 8 numbers). – DenTheMan Jan 16 '11 at 03:51
possible infinite loops, etc. – robermorales Sep 21 '11 at 07:28
@robermorales: Extremely unlikely. – jason Sep 21 '11 at 12:58

score 12 · Answer 11 · edited May 12 '10 at 08:46

Are homework problems allowed here?

This function does crude "base 5" math to generate a number between 0 and 6.

function rnd7() {
    var r1, r2, result;
    do {
        r1 = rnd5() - 1;
        do {
            r2=rnd5() - 1;
        } while (r2 > 1);
        result = r2 * 5 + r1;
    } while (result > 6);
    return result + 1;
}

edited May 12 '10 at 08:46

Anonymoose

answered Sep 26 '08 at 04:41

Will Hartung

A correct solution (which puts you way ahead of the curve), although not very efficient. This makes an average of 5 calls to rnd5() for each call to rnd7(). – Adam Rosenfield May 08 '09 at 03:24
need some more explanation pls – Barry Dec 14 '11 at 03:19
@Barry - First, you can't just add two random numbers together, you don't get a linear solution (consider a pair of dice). Now consider "Base 5": 00, 01, 02, 03, 04, 10, 11. That's 0-6 in base 5. So, we simply need to generate 2 digits of the base 5 number, and add them up until we get one that's within the range. That's what the r2*5+r1 does. The r2 > 1 loop is there because we would never want a high digit of > 1. – Will Hartung Dec 14 '11 at 04:12
This solution does not generate a uniform distribution. The numbers 1 and 7 can only be generated in one way, but 2 through 6 can each be generated in two ways: with r1 equal to the number minus 1 and r2 equal 0 or with r1 equal to the number minus 2 and r2 equal to 1. Thus 2 through 6 will be returned on average twice as often as 1 or 7. – Ted Hopp Jun 15 '15 at 12:53

score 12 · Answer 12 · edited Jun 17 '11 at 05:35

Consider the additional constraint of giving the most efficient answer, i.e. one that, given an input stream I of m uniformly distributed integers from 1-5, outputs a stream O of uniformly distributed integers from 1-7 of the longest possible length relative to m, say L(m).

The simplest way to analyse this is to treat the streams I and O as 5-ary and 7-ary numbers respectively. This is achieved by the main answer's idea of taking the stream a1, a2, a3,... -> a1+5*a2+5^2*a3+.. and similarly for stream O.

Then, for a section of the input stream of length m, choose n such that 5^m-7^n=c, where c>0 is as small as possible. There is a uniform map from the input stream of length m to integers from 1 to 5^m, and another uniform map from integers from 1 to 7^n to the output stream of length n, where we may have to discard a few cases from the input stream when the mapped integer exceeds 7^n.

So this gives a value for L(m) of around m (log5/log7) which is approximately .82m.

The difficulty with the above analysis is the equation 5^m-7^n=c which is not easy to solve exactly and the case where the uniform value from 1 to 5^m exceeds 7^n and we lose efficiency.

The question is how close to the best possible value of m (log5/log7) we can get. For example, when this number is close to an integer, can we find a way to achieve this exact integral number of output values?

If 5^m-7^n=c then from the input stream we effectively generate a uniform random number from 0 to (5^m)-1 and don't use any values higher than 7^n. However these values can be rescued and used again. They effectively generate a uniform sequence of numbers from 1 to 5^m-7^n. So we can then try to use these and convert them into 7-ary numbers so that we can create more output values.

If we let T7(X) be the average length of the output sequence of random(1-7) integers derived from a uniform input of size X, and assume that 5^m = 7^n0 + 7^n1 + 7^n2 + ... + 7^nr + s, with s<7,

then T7(5^m) = n0*7^n0/5^m + ((5^m-7^n0)/5^m) T7(5^m-7^n0), since we have a sequence of length n0 with probability 7^n0/5^m, and a residual of size 5^m-7^n0 with probability (5^m-7^n0)/5^m.

If we just keep substituting we obtain:

T7(5^m) = n0*7^n0/5^m + n1*7^n1/5^m + ... + nr*7^nr/5^m  = (n0*7^n0 + n1*7^n1 + ... + nr*7^nr)/5^m

Hence

L(m) = T7(5^m) = (n0*7^n0 + n1*7^n1 + ... + nr*7^nr)/(7^n0 + 7^n1 + 7^n2 + ... + 7^nr + s)

Another way of putting this is:

If 5^m has 7-ary representation a0 + a1*7 + a2*7^2 + a3*7^3 + ... + ar*7^r,
then L(m) = (a1*7 + 2*a2*7^2 + 3*a3*7^3 + ... + r*ar*7^r)/(a0 + a1*7 + a2*7^2 + a3*7^3 + ... + ar*7^r)
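The digit formula is easy to evaluate for concrete m. The sketch below computes L(m) exactly from the base-7 digits of 5^m; the helper names are illustrative assumptions, not part of the original answer.

```python
from fractions import Fraction
from math import log

def base7_digits(n):
    """Digits a0, a1, ... of n in base 7, least significant first."""
    digits = []
    while n:
        n, d = divmod(n, 7)
        digits.append(d)
    return digits

def L(m):
    """Average number of base-7 outputs extracted from m base-5 inputs."""
    digits = base7_digits(5 ** m)
    numerator = sum(k * a * 7 ** k for k, a in enumerate(digits))
    return Fraction(numerator, 5 ** m)

# Efficiency relative to the bound m * log5/log7, for a few input lengths.
ratios = {m: float(L(m)) / (m * log(5) / log(7)) for m in (2, 6, 11, 20)}
```

For m = 2, 25 is written 34 in base 7, so L(2) = (1*3*7)/25 = 21/25: one output digit is produced whenever the two inputs land in the 21 usable values.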

The best possible case is my original one above where 5^m = 7^n + s, where s<7.

Then T7(5^m) = n*(7^n)/(7^n+s) = n+o(1) = m (Log5/Log7)+o(1) as before.

The worst case is when we can only find k and s such that 5^m = k*7 + s.

Then T7(5^m) = 1*(k*7)/(k*7+s) = 1+o(1)

Other cases are somewhere in between. It would be interesting to see how well we can do for very large m, i.e. how small we can make the error term:

T7(5^m) = m (Log5/Log7)+e(m)

It seems impossible to achieve e(m) = o(1) in general but hopefully we can prove e(m)=o(m).

The whole thing then rests on the distribution of the 7-ary digits of 5^m for various values of m.

I'm sure there is a lot of theory out there that covers this; I may have a look and report back at some point.

+2 (if I could)--this was the only good answer (as opposed to merely adequate). You've got the second best answer that will fit in 32 bit integers. — Rex Kerr, Mar 10 '10 at 19:39

score 10 · Answer 13 · edited May 23 '17 at 11:55

Here is a working Python implementation of Adam's answer.

import random

def rand5():
    return random.randint(1, 5)

def rand7():
    while True:
        r = 5 * (rand5() - 1) + rand5()
        #r is now uniformly random between 1 and 25
        if (r <= 21):
            break
    #result is now uniformly random between 1 and 7
    return r % 7 + 1

I like to throw algorithms I'm looking at into Python so I can play around with them. I thought I'd post it here in the hope that it is useful to someone out there, not that it took long to throw together.

edited May 23 '17 at 11:55

Community

answered May 04 '09 at 04:57

James McMahon

No, that is quite dissimilar from my answer. You're looping 21 times and discarding the first 20 iterations' results. You're also using a rand4() and a rand5() as input, which quite obviously breaks the rules of using only rand5(). Finally, you produce a non-uniform distribution. – Adam Rosenfield May 05 '09 at 13:28
Sorry about that. I was pretty tired when I looked this question over, tired enough that I completely misread your algorithm. I actually threw it into Python because I couldn't understand why you were looping 21 times. Makes a lot more sense now. I did the random.randint(1, 4) thing as a shorthand but I guess you are correct, it is against the spirit of the question. I've corrected the code. – James McMahon May 06 '09 at 00:12
@robermorales - As Adam Rosenfield explained in [his answer](https://stackoverflow.com/a/137809/535871), every solution that gives a true uniform distribution on [1, 7] will involve some sort of accept-reject loop that is potentially infinite. (However, if `rand5()` is a decent PRNG, then the loop will not be infinite because eventually `5*(rand5() - 1) + rand5()` will definitely be <= 21.) – Ted Hopp Jan 31 '19 at 03:02

score 9 · Answer 14 · answered Nov 09 '09 at 12:11

Why not do it simply?

int random7() {
  return random5() + (random5() % 3);
}

The chances of getting 1 and 7 in this solution are lower due to the modulo, however, if you just want a quick and readable solution, this is the way to go.
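How uneven the result is can be seen by listing all 25 equally likely pairs of rolls. The sketch below assumes random5() returns 1 to 5, so the totals run from 1 to 7; with a 0 to 4 convention the totals and their counts differ.

```python
from collections import Counter
from itertools import product

# Every (first roll, second roll) pair is equally likely: 25 outcomes in all.
counts = Counter(a + (b % 3) for a, b in product(range(1, 6), repeat=2))
ways = dict(sorted(counts.items()))
# ways == {1: 1, 2: 3, 3: 5, 4: 5, 5: 5, 6: 4, 7: 2}
```

Under this convention a 1 comes up 1 time in 25 while 3, 4 and 5 each come up 5 times in 25.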

answered Nov 09 '09 at 12:11

Ante

This does not produce a uniform distribution. This produces the numbers 0-6 with probabilities 2/25, 4/25, 5/25, 5/25, 5/25, 3/25, 1/25, as can be verified by counting all 25 possible outcomes. – Adam Rosenfield Dec 05 '09 at 03:40

Dinah · Answer 15 · 2009-06-23T16:51:30.500

8

The premise behind Adam Rosenfield's correct answer is:

x = 5^n (in his case: n=2)
manipulate n rand5 calls to get a number y within range [1, x]
z = ((int)(x / 7)) * 7
if y > z, try again. else return y % 7 + 1

When n equals 2, you have 4 throw-away possibilities: y = {22, 23, 24, 25}. If you use n equals 6, you only have 1 throw-away: y = {15625}.

5^6 = 15625
7 * 2232 = 15624

You call rand5 more times. However, you have a much lower chance of getting a throw-away value (or an infinite loop). If there is a way to get no possible throw-away value for y, I haven't found it yet.
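The trade-off between more calls per attempt and fewer throw-aways can be tabulated. This is a rough sketch under the premise above (n calls form a number in [1, 5^n]); the function names are hypothetical.

```python
def throwaways(n):
    """Number of rejected values when n rand5 calls form a number in [1, 5**n]."""
    return 5 ** n % 7

def expected_calls(n):
    """Expected rand5 calls: n per attempt divided by the acceptance probability."""
    x = 5 ** n
    return n * x / (x - throwaways(n))

table = {n: (throwaways(n), expected_calls(n)) for n in (2, 6)}
```

With n = 2 there are 4 throw-aways and about 2.38 expected calls; with n = 6 there is 1 throw-away but slightly more than 6 expected calls, so the smaller n is cheaper on average.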

edited Jun 23 '09 at 16:51

answered Jun 23 '09 at 14:17

Dinah

There is provably no case without throwaway values--if there was no throwaway, 5^n and 7^m would have a factor in common. But they're (powers of) primes, so they don't. – Rex Kerr Mar 10 '10 at 19:28

Chris Suter · Answer 16 · 2009-09-09T07:23:24.420

Here's my answer:

static struct rand_buffer {
  unsigned v, count;
} buf2, buf3;

void push (struct rand_buffer *buf, unsigned n, unsigned v)
{
  buf->v = buf->v * n + v;
  ++buf->count;
}

#define PUSH(n, v)  push (&buf##n, n, v)

int rand16 (void)
{
  int v = buf2.v & 0xf;
  buf2.v >>= 4;
  buf2.count -= 4;
  return v;
}

int rand9 (void)
{
  int v = buf3.v % 9;
  buf3.v /= 9;
  buf3.count -= 2;
  return v;
}

int rand7 (void)
{
  if (buf3.count >= 2) {
    int v = rand9 ();

    if (v < 7)
      return v % 7 + 1;

    PUSH (2, v - 7);
  }

  for (;;) {
    if (buf2.count >= 4) {
      int v = rand16 ();

      if (v < 14) {
        PUSH (2, v / 7);
        return v % 7 + 1;
      }

      PUSH (2, v - 14);
    }

    // Get a number between 0 & 24
    int v = 5 * (rand5 () - 1) + rand5 () - 1;

    if (v < 21) {
      PUSH (3, v / 7);
      return v % 7 + 1;
    }

    v -= 21;
    PUSH (2, v & 1);
    PUSH (2, v >> 1);
  }
}

It's a little more complicated than others, but I believe it minimises the calls to rand5. As with other solutions, there's a small probability that it could loop for a long time.

This produces a distribution not much different from the other solutions but has the added disadvantage of being needlessly complex. It also suffers from the provably incorrect non-deterministic loop-forever possibility if the numbers are truly random. I still think the ones that produce a slightly less uniform distribution (though still far more than adequate) but guarantee deterministic behavior are better. — paxdiablo, Sep 09 '09 at 05:37
@Pax: Please enlighten me as to how this produces a non-uniform distribution. My analysis of the code, as well as my own testing, indicates that this produces a uniform distribution. As we've previously discussed, it's impossible to both produce a perfectly uniform distribution and have a guaranteed constant time upper bound of the running time. — Adam Rosenfield, Sep 18 '09 at 15:53

score 8 · Answer 17 · answered Apr 30 '09 at 09:39

Assuming that rand(n) here means "random integer in a uniform distribution from 0 to n-1", here's a code sample using Python's randint. Because randint(a, b) includes both endpoints, randint(0, 4) plays the role of rand(5). The code uses only that call, and constants, to produce the effect of rand(7). A little silly, actually

from random import randint

total = 7
while total >= 7:
    first = randint(0, 4)      # randint is inclusive, so this is rand(5)
    toadd = 9999               # any value > 1, so the loop below runs at least once
    while toadd > 1:
        toadd = randint(0, 4)
    if toadd:
        total = first + 5
    else:
        total = first

assert 7 > total >= 0
print(total)

@robermorales Because Python doesn't have `do ... while`. It could have been `1337`, or `12345`, or any number > 1. — tckmn, Jul 06 '14 at 19:43

score 7 · Answer 18 · edited May 23 '17 at 11:33

Simple and efficient:

int rand7 ( void )
{
    return 4; // this number has been calculated using
              // rand5() and is in the range 1..7
}

(Inspired by What's your favorite "programmer" cartoon?).

edited May 23 '17 at 11:33

Community

answered Aug 23 '10 at 13:14

chiccodoro

score 6 · Answer 19 · answered Sep 21 '10 at 15:45

I don't like ranges starting from 1, so I'll start from 0 :-)

unsigned rand5()
{
    return rand() % 5;
}

unsigned rand7()
{
    int r;

    do
    {
        r =         rand5();
        r = r * 5 + rand5();
        r = r * 5 + rand5();
        r = r * 5 + rand5();
        r = r * 5 + rand5();
        r = r * 5 + rand5();
    } while (r > 15623);

    return r / 2232;
}

answered Sep 21 '10 at 15:45

fredoverflow

This is a winner. This produces all 7 outcomes with equal probability. `from collections import defaultdict def r7(n): if not n: yield [] else: for i in range(1, 6): for j in r7(n-1): yield [i] + j def test_r7(): d = defaultdict(int) for x in r7(6): s = (((((((((x[5] * 5) + x[4]) * 5) + x[3]) * 5) + x[2]) * 5) + x[1]) * 5) + x[0] if s <= 15623: d[s % 7] += 1 print d ` – hughdbrown Dec 01 '10 at 18:34

score 5 · Answer 20 · answered Dec 04 '09 at 23:45

As long as there aren't seven possibilities left to choose from, draw another random number, which multiplies the number of possibilities by five. In Perl:

$num = 0;
$possibilities = 1;

sub rand7
{
  while( $possibilities < 7 )
  {
    $num = $num * 5 + int(rand(5));
    $possibilities *= 5;
  }
  my $result = $num % 7;
  $num = int( $num / 7 );
  $possibilities /= 7;
  return $result;
}

your distribution is not uniform, at least on the first call. Indeed, `$possibilities` has always to grow to 25 to exit the loop and return. So, your first result is `[0-124] % 7`, which is not uniformly distributed because `125 % 7 != 0` (this is 6, actually). – bernard paulus Jan 31 '13 at 16:28

philcolbourn · Answer 21 · 2010-04-19T12:21:57.623

I know it has been answered, but this seems to work OK, though I cannot tell you whether it has a bias. My 'testing' suggests it is, at least, reasonable.

Perhaps Adam Rosenfield would be kind enough to comment?

My (naive?) idea is this:

Accumulate rand5's until there are enough random bits to make a rand7. This takes at most 2 rand5's. To get the rand7 number I use the accumulated value mod 7.

To avoid the accumulator overflowing, and since the result is taken mod 7 anyway, I keep only the accumulator mod 7:

(5a + rand5) % 7 = (k*7 + (5a%7) + rand5) % 7 = ( (5a%7) + rand5) % 7

The rand7() function follows:

(I let the range of rand5 be 0-4 and rand7 is likewise 0-6.)

int rand7(){
  static int    a=0;
  static int    e=0;
  int       r;
  a = a * 5 + rand5();
  e = e + 5;        // added 5/7ths of a rand7 number
  if ( e<7 ){
    a = a * 5 + rand5();
    e = e + 5;  // another 5/7ths
  }
  r = a % 7;
  e = e - 7;        // removed a rand7 number
  a = a % 7;
  return r;
}

Edit: Added results for 100 million trials.

'Real' rand functions mod 5 or 7

rand5 : avg=1.999802 0:20003944 1:19999889 2:20003690 3:19996938 4:19995539
rand7 : avg=3.000111 0:14282851 1:14282879 2:14284554 3:14288546 4:14292388 5:14288736 6:14280046

My rand7

Average looks ok and number distributions look ok too.

randt : avg=3.000080 0:14288793 1:14280135 2:14287848 3:14285277 4:14286341 5:14278663 6:14292943

You should probably look at sequential correlation. I think if you take successive pairs (each "random" number paired with its predecessor) then you might find surprising things. You haven't explained WHY it should keep the distribution uniform, at any rate. A working program normally should start with an explanation of why it works. — Ian, Aug 14 '11 at 08:06
Would sequential correlation apply to many of these solutions? — philcolbourn, Aug 29 '11 at 07:38
Would sequential correlation apply to many of these solutions? It has been a while since I attempted this and I thought I explained it. Looking at it now, it looks like I am accumulating random bits in a pool from rand5, ensuring enough have been accumulated before withdrawing enough to make a rand7 number and ensuring I don't overflow my accumulator. — philcolbourn, Aug 29 '11 at 07:44

score 4 · Answer 22 · answered Sep 17 '09 at 23:20

There are elegant algorithms cited above, but here's one way to approach it, although it might be roundabout. I am assuming values generated from 0.

R2 = random number generator giving values less than 2 (sample space = {0, 1})
R8 = random number generator giving values less than 8 (sample space = {0, 1, 2, 3, 4, 5, 6, 7})

In order to generate R8 from R2, you will run R2 thrice, and use the combined result of all 3 runs as a binary number with 3 digits. Here is the range of values when R2 is run thrice:

0 0 0 --> 0
.
.
1 1 1 --> 7

Now to generate R7 from R8, we simply run R8 again if it returns 7:

int R7() {
  int x;
  do {
    x = R8();
  } while (x > 6);
  return x;
}

The roundabout solution is to generate R2 from R5 (just like we generated R7 from R8), then R8 from R2 and then R7 from R8.
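The whole chain can be sketched end to end in Python. This is an illustration only: the seeded `R5` stand-in (returning 0 to 4, matching the answer's zero-based convention) and the sample size are assumptions made here.

```python
import random
from collections import Counter

_rng = random.Random(2024)   # fixed seed, only to make the demo repeatable

def R5():
    """Assumed source: uniform integer in {0, 1, 2, 3, 4}."""
    return _rng.randrange(5)

def R2():
    """Retry R5 until the value is less than 2, as R7 does with R8."""
    while True:
        x = R5()
        if x < 2:
            return x

def R8():
    """Three fair bits read as a binary number in 0..7."""
    return (R2() << 2) | (R2() << 1) | R2()

def R7():
    """Retry R8 until the value is at most 6."""
    while True:
        x = R8()
        if x <= 6:
            return x

# Tally a sample; each bucket should land near 10,000.
counts = Counter(R7() for _ in range(70_000))
```

Keeping only 2 of the 5 values in `R2` follows the answer literally; accepting 4 of the 5 values and using their parity would waste fewer calls.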

like a number of others, this approach could take an arbitrarily long time per R7 call, since you could get a long string of sevens from R8. — Alex North-Keys, Apr 18 '12 at 13:57

score 4 · Answer 23 · answered Dec 29 '09 at 01:46

There you go, uniform distribution and zero rand5 calls.

def rand7():
    global seed
    seed += 1
    if seed >= 7:
        seed = 0
    return seed

Need to set seed beforehand.

answered Dec 29 '09 at 01:46

Kugel

Uniform, but not a random variable – ldmtwo Mar 15 '22 at 01:46

score 4 · Answer 24 · answered Mar 10 '10 at 21:25

Here's a solution that fits entirely within integers and is within about 4% of optimal (i.e. uses 1.26 random numbers in {0..4} for every one in {0..6}). The code's in Scala, but the math should be reasonably clear in any language: you take advantage of the fact that 7^9 + 7^8 is very close to 5^11. So you pick an 11 digit number in base 5, and then interpret it as a 9 digit number in base 7 if it's in range (giving 9 base 7 numbers), or as an 8 digit number if it's over the 9 digit number, etc.:

abstract class RNG {
  def apply(): Int
}

class Random5 extends RNG {
  val rng = new scala.util.Random
  var count = 0
  def apply() = { count += 1 ; rng.nextInt(5) }
}

class FiveSevener(five: RNG) {
  val sevens = new Array[Int](9)
  var nsevens = 0
  val to9 = 40353607;
  val to8 = 5764801;
  val to7 = 823543;
  def loadSevens(value: Int, count: Int) {
    nsevens = 0;
    var remaining = value;
    while (nsevens < count) {
      sevens(nsevens) = remaining % 7
      remaining /= 7
      nsevens += 1
    }
  }
  def loadSevens {
    var fivepow11 = 0;
    var i=0
    while (i<11) { i+=1 ; fivepow11 = five() + fivepow11*5 }
    if (fivepow11 < to9) { loadSevens(fivepow11 , 9) ; return }
    fivepow11 -= to9
    if (fivepow11 < to8) { loadSevens(fivepow11 , 8) ; return }
    fivepow11 -= to8
    if (fivepow11 < 3*to7) loadSevens(fivepow11 % to7 , 7)
    else loadSevens
  }
  def apply() = {
    if (nsevens==0) loadSevens
    nsevens -= 1
    sevens(nsevens)
  }
}

If you paste a test into the interpreter (REPL actually), you get:

scala> val five = new Random5
five: Random5 = Random5@e9c592

scala> val seven = new FiveSevener(five)
seven: FiveSevener = FiveSevener@143c423

scala> val counts = new Array[Int](7)
counts: Array[Int] = Array(0, 0, 0, 0, 0, 0, 0)

scala> var i=0 ; while (i < 100000000) { counts( seven() ) += 1 ; i += 1 }
i: Int = 100000000

scala> counts
res0: Array[Int] = Array(14280662, 14293012, 14281286, 14284836, 14287188,
14289332, 14283684)

scala> five.count
res1: Int = 125902876

The distribution is nice and flat (within about 10k of 1/7 of 10^8 in each bin, as expected from an approximately-Gaussian distribution).

score 3 · Answer 25 · answered May 27 '10 at 08:20

just scale your output from your first function

0) you have a number in range 1-5
1) subtract 1 to make it in range 0-4
2) multiply by (7-1)/(5-1) to make it in range 0-6
3) add 1 to increment the range: Now your result is in between 1-7
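Applying the steps above to each of the five possible inputs shows what the scaling actually produces. A minimal sketch, with illustrative variable names:

```python
# Steps 1-3 applied to every possible rand5 result (1 through 5).
scaled = [(r - 1) * (7 - 1) / (5 - 1) + 1 for r in range(1, 6)]

# Truncating to integers shows which outputs are reachable at all.
distinct_integers = sorted({int(v) for v in scaled})
```

The five inputs map to 1, 2.5, 4, 5.5 and 7, so after truncation only 1, 2, 4, 5 and 7 can ever appear; 3 and 6 are never produced.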

answered May 27 '10 at 08:20

cartonn

Sorry, this would only work if you are working with real numbers or doubles etc... Randomizing is a tricky subject! – cartonn May 27 '10 at 08:22
At step (1), you have 5 distinct values. Step (2) expands the range but does not increase the number of distinct values, so you still have only 5 values at the end. – hughdbrown Dec 01 '10 at 18:13

score 3 · Answer 26 · edited Nov 30 '11 at 17:07

extern int r5();

int r7() {
    return ((r5() & 0x01) << 2 ) | ((r5() & 0x01) << 1 ) | (r5() & 0x01);
}

edited Nov 30 '11 at 17:07

CoolBeans

answered Dec 03 '10 at 01:31

maxchengcn

problem: this returns non-uniformly in range 0-7, not 0-6. Indeed, you can have `7 = 111b` with `p(7) = 8 / 125` – bernard paulus Jan 31 '13 at 02:10
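
The exact distribution of the three-bit construction can be tabulated by enumerating all 125 input triples. The sketch below is in Python rather than C, and `r7_distribution` is a hypothetical helper name. It is run for both conventions, since the comment's figure of 8/125 corresponds to r5() returning 0 to 4, while the question uses 1 to 5.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def r7_distribution(values):
    """Exact distribution of the three-bit construction for a given r5() range."""
    counts = Counter(
        ((a & 1) << 2) | ((b & 1) << 1) | (c & 1)
        for a, b, c in product(values, repeat=3)
    )
    total = len(values) ** 3
    return {k: Fraction(v, total) for k, v in sorted(counts.items())}

print(r7_distribution(range(0, 5)))  # r5() in 0..4, as the comment assumes
print(r7_distribution(range(1, 6)))  # r5() in 1..5, as in the question
```

Either way the result covers 0 to 7 and is not uniform, because the low bit of r5() is not a fair coin.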

Shambho · Answer 27 · 2017-12-02T02:25:21.177

I think I have four answers: two give exact solutions like that of @Adam Rosenfield but without the infinite-loop problem, and the other two give almost-perfect solutions but are faster than the first one.

The best exact solution requires 7 calls to rand5, but let's proceed in order to understand why.

Method 1 - Exact

The strength of Adam's answer is that it gives a perfectly uniform distribution, and there is a very high probability (21/25) that only two calls to rand5() will be needed. However, the worst case is an infinite loop.

The first solution below also gives a perfect uniform distribution, but requires a total of 42 calls to rand5. No infinite loops.

Here is an R implementation:

rand5 <- function() sample(1:5,1)

rand7 <- function()  (sum(sapply(0:6, function(i) i + rand5() + rand5()*2 + rand5()*3 + rand5()*4 + rand5()*5 + rand5()*6)) %% 7) + 1

For people not familiar with R, here is a simplified version:

rand7 = function(){
  r = 0 
  for(i in 0:6){
    r = r + i + rand5() + rand5()*2 + rand5()*3 + rand5()*4 + rand5()*5 + rand5()*6
  }
  return(r %% 7 + 1)
}

The distribution of rand5 will be preserved. If we do the math, each of the 7 iterations of the loop has 5^6 possible combinations, so the total number of possible combinations is 7 * 5^6, and (7 * 5^6) %% 7 = 0. Thus we can divide the random numbers generated into 7 equal groups. See method two for more discussion on this.

Here are all the possible combinations:

table(apply(expand.grid(c(outer(1:5,0:6,"+")),(1:5)*2,(1:5)*3,(1:5)*4,(1:5)*5,(1:5)*6),1,sum) %% 7 + 1)

    1     2     3     4     5     6     7 
15625 15625 15625 15625 15625 15625 15625

I think it's straightforward to show that Adam's method will run much, much faster. The probability that there are 42 or more calls to rand5 in Adam's solution is very small ((4/25)^21 ~ 10^(-17)).

Method 2 - Not Exact

Now the second method, which is almost uniform, but requires 6 calls to rand5:

rand7 <- function() (sum(sapply(1:6,function(i) i*rand5())) %% 7) + 1

Here is a simplified version:

rand7 = function(){
  r = 0 
  for(i in 1:6){
    r = r + i*rand5()
  }
  return(r %% 7 + 1)
}

This is essentially one iteration of method 1. If we generate all possible combinations, here is resulting counts:

table(apply(expand.grid(1:5,(1:5)*2,(1:5)*3,(1:5)*4,(1:5)*5,(1:5)*6),1,sum) %% 7 + 1)

   1    2    3    4    5    6    7 
2233 2232 2232 2232 2232 2232 2232

One number will appear once more in 5^6 = 15625 trials.

Now, in Method 1, by adding the offsets 1 to 6, we move the count of 2233 to each successive point in turn. Thus the total number of combinations for each value will match up. This works because 5^6 %% 7 = 1, and then we do 7 appropriate variations, so (7 * 5^6) %% 7 = 0.

Method 3 - Exact

If the arguments of methods 1 and 2 are understood, method 3 follows, and requires only 7 calls to rand5. At this point, I feel this is the minimum number of calls needed for an exact solution.

Here is an R implementation:

rand5 <- function() sample(1:5,1)

rand7 <- function()  (sum(sapply(1:7, function(i) i * rand5())) %% 7) + 1

For people not familiar with R, here is a simplified version:

rand7 = function(){
  r = 0 
  for(i in 1:7){
    r = r + i * rand5()
  }
  return(r %% 7 + 1)
}

The distribution of rand5 will be preserved. If we do the math, each of the 7 iterations of the loop has 5 possible outcomes, so the total number of possible combinations is 7 * 5, and (7 * 5) %% 7 = 0. Thus we can divide the random numbers generated into 7 equal groups. See methods one and two for more discussion on this.

Here are all the possible combinations:

table(apply(expand.grid(0:6,(1:5)),1,sum) %% 7 + 1)

1 2 3 4 5 6 7  
5 5 5 5 5 5 5

I think it's straightforward to show that Adam's method will still run faster. The probability that there are 7 or more calls to rand5 in Adam's solution is still small ((4/25)^3 ~ 0.004).

Method 4 - Not Exact

This is a minor variation of the second method. It is almost uniform, but requires 7 calls to rand5, that is, one more than method 2:

rand7 <- function() ((rand5() + sum(sapply(1:6,function(i) i*rand5()))) %% 7) + 1

Here is a simplified version:

rand7 = function(){
  r = 0 
  for(i in 1:6){
    r = r + i*rand5()
  }
  return((r+rand5()) %% 7 + 1)
}

If we generate all possible combinations, here is resulting counts:

table(apply(expand.grid(1:5,(1:5)*2,(1:5)*3,(1:5)*4,(1:5)*5,(1:5)*6,1:5),1,sum) %% 7 + 1)

    1     2     3     4     5     6     7 
11160 11161 11161 11161 11161 11161 11160

Two numbers will appear once less in 5^7 = 78125 trials. For most purposes, I can live with that.

I'm not familiar with R, but unless I'm misunderstanding how these work, then method 1 is not exact. It has (5^6)^7 = 5^42 possible outcomes, not (5^6)*7; 5^42 is not divisible by 7. Likewise method 3 is not exact. It has 5^7 possible outcomes, not 5*7. (The last loop iteration in method 3 with `i=7` also has no effect, since adding `7*rand5()` to `r` does not change the value of `r` mod 7.) — Adam Rosenfield, Jan 31 '18 at 22:32
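
The divisibility facts in the comment above, and the outcome tally for method 3, can be checked by brute force. This is a hedged Python sketch rather than R; `method3_counts` is a hypothetical name, and it enumerates all 5^7 equally likely sequences of rand5 results fed to the method 3 loop.

```python
from collections import Counter
from itertools import product

# Both remainders are nonzero, so neither outcome space splits into 7 equal parts.
print(5**42 % 7, 5**7 % 7)

def method3_counts():
    """Tally (sum of i * r_i for i = 1..7) mod 7 + 1 over all 5**7 rand5 outcomes."""
    counts = Counter()
    for rolls in product(range(1, 6), repeat=7):
        total = sum(i * r for i, r in enumerate(rolls, start=1))
        counts[total % 7 + 1] += 1
    return counts

counts = method3_counts()
print(sorted(counts.items()))
```

Because the i = 7 term vanishes mod 7, the tally is the method 2 table scaled by 5, so one value is still slightly over-represented.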

score 3 · Answer 28 · answered Apr 01 '11 at 04:06

in php

function rand1to7() {
    do {
        $output_value = 0;
        for ($i = 0; $i < 28; $i++) {
            $output_value += rand1to5();
        }
    } while ($output_value == 140); // re-roll only when the sum is the maximum, 140
    $output_value -= 12;
    return floor($output_value / 16);
}

This loops to produce a random number between 16 and 127, divides by sixteen to create a float between 1 and 7.9375, then rounds down to get an int between 1 and 7. If I am not mistaken, there is a 16/112 chance of getting any one of the 7 outcomes.

although there is probably an easier answer similar to this using no conditional loop, and modulo instead of floor. i just can't crunch the numbers right now. — dqhendricks, Apr 01 '11 at 15:09
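
Whether each outcome really has a 16/112 chance can be checked exactly by counting how many ways 28 draws can reach each sum. The sketch below is in Python rather than PHP, and it assumes the intent described in the text: re-roll only when the sum is 140, subtract 12, then floor-divide by 16. `sum_distribution` is a hypothetical helper.

```python
from fractions import Fraction

def sum_distribution(draws):
    """ways[s] = number of ways `draws` calls to rand1to5() can sum to s."""
    ways = {0: 1}
    for _ in range(draws):
        nxt = {}
        for s, w in ways.items():
            for r in range(1, 6):
                nxt[s + r] = nxt.get(s + r, 0) + w
        ways = nxt
    return ways

ways = sum_distribution(28)
ways.pop(140)                      # the sum 140 is re-rolled
total = sum(ways.values())

buckets = {}
for s, w in ways.items():
    k = (s - 12) // 16             # floor((sum - 12) / 16)
    buckets[k] = buckets.get(k, 0) + w

for k in sorted(buckets):
    print(k, float(Fraction(buckets[k], total)))
```

The sums are equally spaced but not equally likely (a sum of 28 draws clusters around 84), so the middle outcome dominates.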

paxdiablo · Answer 29 · 2009-05-08T03:03:29.660

3

By using a rolling total, you can both

maintain an equal distribution; and
not have to sacrifice any element in the random sequence.

Both these problems are an issue with the simplistic rand(5)+rand(5)...-type solutions. The following Python code shows how to implement it (most of this is proving the distribution).

import random
x = []
for i in range (0,7):
    x.append (0)
t = 0
tt = 0
for i in range (0,700000):
    ########################################
    #####            qq.py             #####
    r = int (random.random () * 5)
    t = (t + r) % 7
    ########################################
    #####       qq_notsogood.py        #####
    #r = 20
    #while r > 6:
        #r =     int (random.random () * 5)
        #r = r + int (random.random () * 5)
    #t = r
    ########################################
    x[t] = x[t] + 1
    tt = tt + 1
high = x[0]
low = x[0]
for i in range (0,7):
    print("%d: %7d %.5f" % (i, x[i], 100.0 * x[i] / tt))
    if x[i] < low:
        low = x[i]
    if x[i] > high:
        high = x[i]
diff = high - low
print("Variation = %d (%.5f%%)" % (diff, 100.0 * diff / tt))

And this output shows the results:

pax$ python qq.py
0:   99908 14.27257
1:  100029 14.28986
2:  100327 14.33243
3:  100395 14.34214
4:   99104 14.15771
5:   99829 14.26129
6:  100408 14.34400
Variation = 1304 (0.18629%)

pax$ python qq.py
0:   99547 14.22100
1:  100229 14.31843
2:  100078 14.29686
3:   99451 14.20729
4:  100284 14.32629
5:  100038 14.29114
6:  100373 14.33900
Variation = 922 (0.13171%)

pax$ python qq.py
0:  100481 14.35443
1:   99188 14.16971
2:  100284 14.32629
3:  100222 14.31743
4:   99960 14.28000
5:   99426 14.20371
6:  100439 14.34843
Variation = 1293 (0.18471%)

A simplistic rand(5)+rand(5), ignoring those cases where this returns more than 6, has a typical variation of 18%, 100 times that of the method shown above:

pax$ python qq_notsogood.py
0:   31756 4.53657
1:   63304 9.04343
2:   95507 13.64386
3:  127825 18.26071
4:  158851 22.69300
5:  127567 18.22386
6:   95190 13.59857
Variation = 127095 (18.15643%)

pax$ python qq_notsogood.py
0:   31792 4.54171
1:   63637 9.09100
2:   95641 13.66300
3:  127627 18.23243
4:  158751 22.67871
5:  126782 18.11171
6:   95770 13.68143
Variation = 126959 (18.13700%)

pax$ python qq_notsogood.py
0:   31955 4.56500
1:   63485 9.06929
2:   94849 13.54986
3:  127737 18.24814
4:  159687 22.81243
5:  127391 18.19871
6:   94896 13.55657
Variation = 127732 (18.24743%)
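
The percentages in the qq_notsogood.py runs match what exact counting predicts. As a minimal sketch, assuming rand(5) returns 0 to 4 as in the script, the 25 equally likely pairs can be enumerated and the discarded sums dropped:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 25 equally likely (a, b) pairs with a, b in 0..4; sums above 6 are discarded.
kept = Counter(a + b for a, b in product(range(5), repeat=2) if a + b <= 6)
total = sum(kept.values())

for value in range(7):
    # value, number of pairs producing it, and its exact share as a percentage
    print(value, kept[value], round(100 * float(Fraction(kept[value], total)), 3))
```

Only 22 pairs survive, split 1, 2, 3, 4, 5, 4, 3 across the values 0 to 6, which is the triangular shape seen in the measured output.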

And, on the advice of Nixuz, I've cleaned the script up so you can just extract and use the rand7... stuff:

import random

# rand5() returns 0 through 4 inclusive.

def rand5():
    return int (random.random () * 5)

# rand7() generator returns 0 through 6 inclusive (using rand5()).

def rand7():
    rand7ret = 0
    while True:
        rand7ret = (rand7ret + rand5()) % 7
        yield rand7ret

# Number of test runs.

count = 700000

# Work out distribution.

distrib = [0,0,0,0,0,0,0]
rgen = rand7()
for i in range (0,count):
    r = next(rgen)
    distrib[r] = distrib[r] + 1

# Print distributions and calculate variation.

high = distrib[0]
low = distrib[0]
for i in range (0,7):
    print("%d: %7d %.5f" % (i, distrib[i], 100.0 * distrib[i] / count))
    if distrib[i] < low:
        low = distrib[i]
    if distrib[i] > high:
        high = distrib[i]
diff = high - low
print("Variation = %d (%.5f%%)" % (diff, 100.0 * diff / count))

edited May 08 '09 at 03:03

answered May 07 '09 at 03:43

paxdiablo

Your definition of variance is completely different from the standard statistical definition of variance. – Adam Rosenfield May 08 '09 at 01:17
Variance has many meanings of which the statistical is one, I was obviously using another (which may or may not be in an English dictionary :-) - I was just looking for a word that could be used for the percentage difference between the highest and lowest occurrence (variance, variation, variability), the point being to show the relative distribution-ness of the different methods. [And, yes, I know distribution-ness is almost certainly not a real word :-) ]. Anyway, I'll change it to keep you happy. – paxdiablo May 08 '09 at 01:50
I have posted a cleaner implementation of this method here: http://rafb.net/p/AQXiVL18.html – Nixuz May 08 '09 at 03:03
Again, like many other solutions, this does not achieve a perfectly uniform distribution. After the first several numbers, it approximates a uniform distribution to many orders of magnitude. The bigger problem, though, is correlation between consecutive numbers produced. If x and y are two consecutive numbers in the RNG sequence, all 49 possible pairs are not even close to being equally probable -- only 35 of the 49 pairs are achievable. – Adam Rosenfield May 08 '09 at 03:18
Err, let me rephrase that. Given that a particular x was produced at some point in the sequence, only 5 of 7 numbers can be produced for the next number in the sequence. A true RNG would have all samples be independent of one another, but in this case they are clearly not. – Adam Rosenfield May 08 '09 at 03:20
And that is true of any linear congruential RNG as well, Adam, given the formula they use. Short of having a "Schrodinger's Cat" device in your PC or basing it on some other truly random thing, you're not going to get any better. – paxdiablo May 08 '09 at 03:29
@Pax: see my comments in my answer. We're not talking about linear congruential RNGs here. – Adam Rosenfield May 08 '09 at 04:09
@Adam, that was an assumption *you* introduced in the comments to your answer. It was not in the original question and I would be more likely to believe that rand5() is a deterministic RNG rather than truly random. – paxdiablo May 08 '09 at 07:56
It's true that the original question doesn't specify if the input and output functions produce independent and identically-distributed (iid) samples, but I think it's a reasonable expectation that if the input rand5() is iid, then the output rand7() should also be iid. If you don't think that's reasonable, have fun using your non-iid RNG. – Adam Rosenfield May 08 '09 at 19:54
So, what's the word from the mathematicians at the university? – Adam Rosenfield May 12 '09 at 02:52
Still waiting, I only see them once a fortnight for our Sunday bike ride and I'm taking number 1 son to the other side of the country tomorrow so it'll be at least another two weeks. – paxdiablo May 12 '09 at 02:59
This solution is clearly broken. It's obvious that you need to be calling rand5 (on average) more than once per call to rand7 and this solution doesn't. Therefore the results cannot be random by any sane definition of random. – Chris Suter Sep 09 '09 at 04:11
@Chris, the only sane definition of random is "random", and that means you cannot say *anything* about the distribution except over an infinite sample space. If you have a convincing proof that it's broken other than your vague feelings, let's see it. Pseudo-random numbers are only required to have a roughly equal distribution with a large repeat cycle. This solution has both of those features. – paxdiablo Sep 09 '09 at 04:59
@Pax At every iteration of your function, it can only return one of five different values (albeit in the range 0-6). The very first iteration can only return a number in the range 0-4. So, it should be clear that whilst your function may have uniform distribution, the samples are not independent i.e. they're correlated which isn't something you want in a random number generator. – Chris Suter Sep 09 '09 at 05:57
If you understand how linear congruential algorithms work, Chris, you'll know that determinism is a feature of them all. The next value is *wholly* dependent on the one that came before. The trick is to make it seem random by careful selection of the algorithm. If you wanted a truly random sequence, you'd be using quantum effects or a more random seed such as time between user keypresses. – paxdiablo Sep 09 '09 at 06:51
@Pax Ignoring the fact that your algorithm isn't actually a conventional LCG algorithm, LCG algorithms are considered low quality pseudorandom number generators. We're not told what the quality of rand5() is, but we must assume that it's high quality and that you'd want the output stream to be no worse. That's not the case with your algorithm. – Chris Suter Sep 09 '09 at 10:12
Why does a uniform rand5()+rand5() approach to the problem generate a non-uniform result? – hookenz Apr 06 '11 at 23:54
@Matt, it's because of the statistical nature of the results. A random number generator which gives a relatively even distribution across the sample space (which it will for a large enough sample) will start to become uneven as soon as you start throwing away values. For example, you will toss away all (4,4) pairs since they total 8 and you will not toss away any (2,2) pairs. It's that selectiveness which will skew the distribution. A rolling method doesn't toss away anything and the distribution remains relatively even. – paxdiablo Apr 07 '11 at 07:18
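
The "35 of the 49 pairs" figure raised in the comments can be reproduced without any randomness, by listing which successor values the rolling total allows. This is a short illustrative sketch, assuming the generator from the answer (next value = (previous + r) % 7 with r in 0 to 4).

```python
# Every (x, y) pair of consecutive outputs the rolling-total generator can emit:
# y = (x + r) % 7 with r drawn from 0..4.
reachable = {(x, (x + r) % 7) for x in range(7) for r in range(5)}
all_pairs = {(x, y) for x in range(7) for y in range(7)}

print(len(reachable), "of", len(all_pairs), "pairs are achievable")
print(sorted(all_pairs - reachable)[:4])   # a few pairs that can never occur
```

Each value has exactly five possible successors, which is the dependence between consecutive samples that the discussion is about.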

ShuggyCoUk · Answer 30 · 2009-05-14T10:17:30.820

This answer is more an experiment in obtaining the most entropy possible from the Rand5 function. It is therefore somewhat unclear and almost certainly a lot slower than other implementations.

Assuming the uniform distribution from 0-4 and resulting uniform distribution from 0-6:

public class SevenFromFive
{
  public SevenFromFive()
  {
    // this outputs a uniform distribution but for some reason including it
    // screws up the output distribution
    // open question Why?
    this.fifth = new ProbabilityCondensor(5, b => {});
    this.eigth = new ProbabilityCondensor(8, AddEntropy);
  } 

  private static Random r = new Random();
  private static uint Rand5()
  {
    return (uint)r.Next(0,5);
  }

  private class ProbabilityCondensor
  {
    private readonly int samples;
    private int counter;
    private int store;
    private readonly Action<bool> output;

    public ProbabilityCondensor(int chanceOfTrueReciprocal,
      Action<bool> output)
    {
      this.output = output;
      this.samples = chanceOfTrueReciprocal - 1;  
    }

    public void Add(bool bit)
    {
      this.counter++;
      if (bit)
        this.store++;   
      if (counter == samples)
      {
        bool? e;
        if (store == 0)
          e = false;
        else if (store == 1)
          e = true;
        else
          e = null;// discard for now       
        counter = 0;
        store = 0;
        if (e.HasValue)
          output(e.Value);
      }
    }
  }

  ulong buffer = 0;
  const ulong Mask = 7UL;
  int bitsAvail = 0;
  private readonly ProbabilityCondensor fifth;
  private readonly ProbabilityCondensor eigth;

  private void AddEntropy(bool bit)
  {
    buffer <<= 1;
    if (bit)
      buffer |= 1;      
    bitsAvail++;
  }

  private void AddTwoBitsEntropy(uint u)
  {
    buffer <<= 2;
    buffer |= (u & 3UL);    
    bitsAvail += 2;
  }

  public uint Rand7()
  {
    uint selection;   
    do
    {
      while (bitsAvail < 3)
      {
        var x = Rand5();
        if (x < 4)
        {
          // put the two low order bits straight in
          AddTwoBitsEntropy(x);
          fifth.Add(false);
        }
        else
        { 
          fifth.Add(true);
        }
      }
      // read 3 bits
      selection = (uint)((buffer & Mask));
      bitsAvail -= 3;     
      buffer >>= 3;
      if (selection == 7)
        eigth.Add(true);
      else
        eigth.Add(false);
    }
    while (selection == 7);   
    return selection;
  }
}

The number of bits added to the buffer per call to Rand5 is currently 4/5 * 2 so 1.6. If the 1/5 probability value is included that increases by 0.05 so 1.65 but see the comment in the code where I have had to disable this.

Bits consumed by call to Rand7 = 3 + 1/8 * (3 + 1/8 * (3 + 1/8 * (...
This is 3 + 3/8 + 3/64 + 3/512 ... so approx 3.42

By extracting information from the sevens I reclaim 1/8*1/7 bits per call so about 0.018

This gives a net consumption 3.4 bits per call which means the ratio is 2.125 calls to Rand5 for every Rand7. The optimum should be 2.1.
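
The arithmetic in the last few paragraphs can be reproduced in a few lines. This Python sketch only restates the figures given above; the final line, log 7 / log 5, is added here as a separate information-theoretic point of comparison and is not a figure taken from the answer.

```python
import math

bits_in_per_rand5 = (4 / 5) * 2          # two bits whenever Rand5() < 4
bits_out_per_rand7 = 3 / (1 - 1 / 8)     # 3 + 3/8 + 3/64 + ... (geometric series)

print(bits_in_per_rand5)                       # 1.6
print(round(bits_out_per_rand7, 4))            # 3.4286
print(round(3.4 / bits_in_per_rand5, 3))       # 2.125, the ratio quoted above
print(round(math.log(7) / math.log(5), 3))     # 1.209, entropy of one rand7 draw in rand5 draws
```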

I would imagine this approach is significantly slower than many of the other ones here unless the cost of the call to Rand5 is extremely expensive (say calling out to some external source of entropy).

Your solution appears correct, aside from some simple errors: "if(count > 1)" should be "if(count <= 1)", and the "i++" that occurs shortly thereafter should be inside the curly braces that precede it. I'm not sure whether or not BitsSet() is correct, but that's somewhat irrelevant. — Adam Rosenfield, May 13 '09 at 18:51
Overall, though, your function is very difficult to understand. It does make a _slightly_ better use of entropy than it otherwise could, at the cost of more complication. There's also no reason to initially fill the buffer with 35 random bits on the first call, when 3 would suffice. — Adam Rosenfield, May 13 '09 at 18:56
I corrected the <= thanks, the i++ really should be there though. It should happen on the zero and the 1 case (adding a 1 or a zero respectively to the buffer). This is absolutely not what I would suggest using, it's horribly complicated. I was just interested in how close I could get to the theoretical entropy limits inherent in the problem... Thanks for the feedback. Ironically the filling of the buffer on the first call was to make it simpler to write :) — ShuggyCoUk, May 13 '09 at 20:35
I reworked this to be easier to understand (at the cost of speed) but also made it correct. It is not optimum yet, for some reason the 1/5 bits cause issues even though they are uniform in count. — ShuggyCoUk, May 14 '09 at 10:18

bernard paulus · Answer 31 · 2013-01-31T23:36:11.337

We are using the convention rand(n) -> [0, n - 1] here

From the many answers I read, each provides either uniformity or a halting guarantee, but not both (Adam Rosenfield's second answer might).

It is, however, possible to do so. We basically have this distribution:

This leaves us a hole in the distribution over [0-6]: 5 and 6 have no probability of occurrence. Imagine now that we try to fill the hole by shifting the probability distribution and summing.

Indeed, we can sum the initial distribution with itself shifted by one, and repeat by summing the obtained distribution with the initial one shifted by two, then three and so on, up to but not including 7 (at which point we have covered the whole range). This is shown in the following figure. The order of the colors, corresponding to the steps, is blue -> green -> cyan -> white -> magenta -> yellow -> red.

Because each slot is covered by 5 of the 7 shifted distributions (the shift varies from 0 to 6), and because we assume the random numbers are independent from one rand5() call to another, we obtain

p(x) = 5 / 35 = 1 / 7       for all x in [0, 6]

This means that, given 7 independent random numbers from rand5(), we can compute a random number with uniform probability in the [0-6] range. In fact, the rand5() probability distribution does not even need to be uniform, as long as the samples are independent (so the distribution stays the same from trial to trial). Also, this is valid for numbers other than 5 and 7.

This gives us the following python function:

def rand_range_transform(rands):
    """
    returns a uniform random number in [0, len(rands) - 1]
    if all r in rands are independent random numbers from the same uniform distribution
    """
    return sum((x + i) for i, x in enumerate(rands)) % len(rands) # a single modulo outside the sum is enough in modulo arithmetic

This can be used like this:

import random

rand5 = lambda: random.randrange(5)

def rand7():
    return rand_range_transform([rand5() for _ in range(7)])

If we call rand7() 70000 times, we can get:

max: 6 min: 0 mean: 2.99711428571 std: 2.00194697049
0:  10019
1:  10016
2:  10071
3:  10044
4:  9775
5:  10042
6:  10033

This is good, although far from perfect. The fact is, one of our assumptions is most likely false in this implementation: we use a PRNG, and as such, the result of the next call is dependent on the last result.

That said, using a truly random source of numbers, the output should also be truly random. And this algorithm terminates in every case.

But this comes with a cost: we need 7 calls to rand5() for a single rand7() call.
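
Since only 5^7 = 78125 input tuples exist, the uniformity claim can also be tested exhaustively instead of by sampling. The sketch below reuses the transform exactly as written in this answer and feeds it every possible tuple of seven rand5() results, assuming the 0 to 4 convention stated at the top.

```python
from collections import Counter
from itertools import product

def rand_range_transform(rands):
    # Same formula as in the answer above.
    return sum((x + i) for i, x in enumerate(rands)) % len(rands)

# Feed every one of the 5**7 equally likely input tuples through the transform.
counts = Counter(rand_range_transform(t) for t in product(range(5), repeat=7))

print(sorted(counts.items()))
print(5**7 % 7)   # nonzero, so the seven counts cannot all be equal
```

Any fixed number of rand5() calls yields a power of 5 equally likely inputs, which is never a multiple of 7, so the tally can be close to flat but not exactly flat.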

score 2 · Answer 32 · answered Dec 05 '09 at 07:27

The function you need is rand1_7(), I wrote rand1_5() so that you can test it and plot it.

import numpy
def rand1_5():
    return numpy.random.randint(5)+1

def rand1_7():
    q = 0
    for i in range(7):
        q += rand1_5()
    return q%7 + 1

answered Dec 05 '09 at 07:27

Andrea Ambu

score 2 · Answer 33 · answered Jan 21 '14 at 07:56

This solution doesn't waste any entropy and gives the first available truly random number in range. With each iteration the probability of not getting an answer is provably decreased. The probability of getting an answer in N iterations is the probability that a random number between 0 and max (5^N) will be smaller than the largest multiple of seven in that range (max-max%7). Must iterate at least twice. But that's necessarily true for all solutions.

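int random7() {
  int range = 1;      /* remainder is uniform over [0, range) */
  int remainder = 0;

  while (1) {
    remainder = remainder * 5 + random5() - 1;
    range = range * 5;

    int limit = range - (range % 7);   /* largest multiple of 7 <= range */
    if (remainder < limit) return (remainder % 7) + 1;

    remainder = remainder % 7;         /* keep the leftover randomness */
    range = range % 7;
  }
}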

Numerically equivalent to:

r5=5;
num=random5()-1;
while (1) {
   num=num*5+random5()-1;
   r5=r5*5;
   r7=r5-r5%7;
   if (num<r7) return num%7+1;
}

The first code calculates it in modulo form. The second code is just plain math. Or I made a mistake somewhere. :-)
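
For readers who want to run the modulo-form loop, here is a Python port as a minimal sketch. `random5` is a stand-in for the given generator built on the standard `random` module, and `span` replaces the name `range` only because that is a Python builtin.

```python
import random
from collections import Counter

def random5():
    # Stand-in for the given generator: uniform integer in 1..5.
    return random.randint(1, 5)

def random7():
    """Python port of the modulo-form loop above."""
    span, remainder = 1, 0
    while True:
        remainder = remainder * 5 + random5() - 1
        span *= 5
        limit = span - span % 7          # largest multiple of 7 not above span
        if remainder < limit:
            return remainder % 7 + 1
        remainder %= 7                   # keep the leftover randomness
        span %= 7

random.seed(0)                           # fixed seed only to make the run repeatable
tally = Counter(random7() for _ in range(70000))
print(sorted(tally.items()))
```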

score 2 · Answer 34 · answered Jun 01 '15 at 08:53

Another answer which appears to have not been covered here:

int rand7() {
  int r = 7 / 2;
  for (int i = 0; i < 28; i++)
    r = ((rand5() - 1) * 7 + r) / 5;
  return r + 1;
}

On every iteration r is a random value between 0 and 6 inclusive. This is appended (base 7) to a random value between 0 and 4 inclusive, and the result is divided by 5, giving a new random value in the range of 0 to 6 inclusive. r starts with a substantial bias (r = 3 is very biased!) but each iteration divides that bias by 5.

This method is not perfectly uniform; however, the bias is vanishingly small, something on the order of 1/(2**64). What's important about this approach is that it has constant execution time (assuming rand5() also has constant execution time). There are no theoretical concerns that an unlucky call could iterate forever picking bad values.
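
The size of the remaining bias can be computed exactly, because r only ever takes seven values. The following Python sketch (the function name `exact_counts` is hypothetical) tracks, over all 5^28 possible sequences of rand5() results, how many end at each final r, starting from r = 3 as `7 / 2` does in C.

```python
def exact_counts(iterations=28, start=3):
    """Count, over all 5**iterations rand5 sequences, how often each final r occurs."""
    counts = [0] * 7
    counts[start] = 1
    for _ in range(iterations):
        nxt = [0] * 7
        for r, ways in enumerate(counts):
            for d in range(5):                 # d plays the role of rand5() - 1
                nxt[(d * 7 + r) // 5] += ways
        counts = nxt
    return counts

counts = exact_counts()
print(counts)
print(max(counts) - min(counts), "out of", 5**28)
```

The seven counts differ by a single sequence out of 5^28, which is consistent with the "vanishingly small" bias described above.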

Also, a sarcastic answer for good measure (deliberately or not, it has been covered):

1-5 is already within the range 1-7, therefore the following is a valid implementation:

int rand7() {
  return rand5();
}

Question did not ask for uniform distribution.

score 2 · Answer 35 · answered Oct 09 '10 at 06:47

function Rand7
   put 200 into x
   repeat while x > 118
      put ((random(5)-1) * 25) + ((random(5)-1) * 5) + (random(5)-1) into x
   end repeat
   return (x mod 7) + 1
end Rand7

Three calls to Rand5, which only repeats 6 times out of 125, on average.

Think of it as a 3D array, 5x5x5, filled with 1 to 7 over and over, and 6 blanks. Re-roll on the blanks. The rand5 calls create a three digit base-5 index into that array.

There would be fewer repeats with a 4D, or higher N-dimensional arrays, but this means more calls to the rand5 function become standard. You'll start to get diminishing efficiency returns at higher dimensions. Three seems to me to be a good compromise, but I haven't tested them against each other to be sure. And it would be rand5-implementation specific.
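
The trade-off between dimensions can be estimated without benchmarking by computing the expected number of rand5 calls per result. This is a rough sketch under the assumption that each attempt always makes all `digits` calls and is accepted when the base-5 number falls below the largest multiple of 7; `expected_rand5_calls` is a hypothetical helper.

```python
def expected_rand5_calls(digits):
    """Expected rand5 calls per rand7 when rejecting on a `digits`-digit base-5 number."""
    total = 5 ** digits
    accepted = total - total % 7        # largest multiple of 7 not above total
    return digits * total / accepted

for n in range(2, 6):
    print(n, round(expected_rand5_calls(n), 4))
```

By this measure two digits cost about 2.38 calls on average and three digits about 3.15, so the lower rejection rate of the 5x5x5 table does not pay for its extra call.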

score 2 · Answer 36 · answered Nov 30 '10 at 17:22

int getOneToSeven(){
    int added = 0;
    for(int i = 1; i<=7; i++){
        added += getOneToFive();
    }
    return (added)%7+1;
}

answered Nov 30 '10 at 17:22

pas1311

This can't be right because there are 5**7 possible outcomes (78125) but 78125%7 = 5 -- meaning that there is an uneven distribution of the seven desired outcomes. – hughdbrown Dec 01 '10 at 18:27

score 2 · Answer 37 · answered Dec 01 '10 at 18:46

This is the simplest answer I could create after reviewing others' answers:

def r5tor7():
    while True:
        cand = (5 * r5()) + r5()
        if cand < 27:
            return cand

cand is in the range [6, 27] and the possible outcomes are evenly distributed if the possible outcomes from r5() are evenly distributed. You can test my answer with this code:

from collections import defaultdict

def r5_outcome(n):
    if not n:
        yield []
    else:
        for i in range(1, 6):
            for j in r5_outcome(n-1):
                yield [i] + j

def test_r7():
    d = defaultdict(int)
    for x in r5_outcome(2):
        s = sum([x[i] * 5**i for i in range(len(x))])
        if s < 27:
            d[s] += 1
    print(len(d), d)

r5_outcome(2) generates all possible combinations of r5() results. I use the same filter to test as in my solution code. You can see that all of the outcomes are equally probable because they have the same value.

score 2 · Answer 38 · edited Jan 18 '11 at 06:59

package CareerCup;

public class RangeTransform {
 static int counter = (int)(Math.random() * 5 + 1);

 private int func() {
  return (int) (Math.random() * 5 + 1);
 }

 private int getMultiplier() {
  return counter % 5 + 1;
 }

 public int rangeTransform() {
  counter++;
  int count = getMultiplier();
  int mult = func() + 5 * count;
  System.out.println("Mult is : " + 5 * count);
  return (mult) % 7 + 1;
 }

 /**
  * @param args
  */
 public static void main(String[] args) {
  // TODO Auto-generated method stub
  RangeTransform rangeTransform = new RangeTransform();
  for (int i = 0; i < 35; i++)
   System.out.println("Val is : " + rangeTransform.rangeTransform());
 }
}

score 2 · Answer 39 · edited Dec 01 '12 at 05:29

Why won't this work? Other than the one extra call to rand5()?

i = rand5() + rand5() + (rand5() - 1) //Random number between 1 and 14

i = i % 7 + 1;

edited Dec 01 '12 at 05:29

Andro Selva

answered Jan 27 '11 at 07:29

Sam

First, the comment is wrong - rand5() + rand5() + rand5() - 1 will always be at least 2. Regardless, though, your solution is not uniform; 6's are about 20% more likely than 3's. – Mark Reed Mar 28 '12 at 01:07

Rob Leclerc · Answer 40 · 2011-03-16T21:30:07.907

2

For values 0-7 you have the following:

Bitwise, from left to right, Rand5() has p(1) = {2/5, 2/5, 3/5}. So if we complement those probability distributions (~Rand5()) we should be able to use that to produce our number. I'll try to report back later with a solution. Anyone have any thoughts?
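
The per-bit probabilities quoted above can be confirmed by counting set bits across the five possible values. This is a small illustrative Python sketch, assuming Rand5() returns 1 to 5 and the result is viewed as a 3-bit number.

```python
from fractions import Fraction

values = range(1, 6)                      # Rand5() returns 1..5
bit_probs = []
for bit in (2, 1, 0):                     # most significant of the 3 bits first
    ones = sum((v >> bit) & 1 for v in values)
    bit_probs.append(Fraction(ones, len(values)))

print(bit_probs)                          # probability that each bit is 1, left to right
```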

R

edited Mar 16 '11 at 21:30

answered Mar 16 '11 at 17:07

Rob Leclerc

score 2 · Answer 41 · edited Mar 18 '11 at 15:48

rand25() =5*(rand5()-1) + rand5()

rand7() { 
   while(true) {
       int r = rand25();
       if (r <= 21) return r % 7 + 1;   // rand25() is in 1..25; 21 values map evenly onto 1..7
   }
}

Why this works: probability that the loop will run forever is 0.

edited Mar 18 '11 at 15:48

Joachim Sauer

answered Mar 18 '11 at 15:32

ekr

score 2 · Answer 42 · edited Dec 14 '11 at 10:00

Assuming rand gives equal weighting to all bits, then masks with the upper bound.

int i = rand(5) ^ (rand(5) & 2);

rand(5) can only return: 1b, 10b, 11b, 100b, 101b. You only need to concern yourself with sometimes setting the 2 bit.
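
What the XOR expression produces can be tabulated over all 25 input pairs. This is a minimal sketch, taking rand(5) to return 1 to 5 to match the bit patterns listed above.

```python
from collections import Counter
from itertools import product

# rand(5) is taken to return 1..5, matching the bit patterns listed above.
counts = Counter(a ^ (b & 2) for a, b in product(range(1, 6), repeat=2))

print(sorted(counts.items()))   # (value, number of the 25 pairs producing it)
```

The tally shows which values occur and how often, including any that fall outside 1 to 7.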

edited Dec 14 '11 at 10:00

JMax

answered Apr 04 '11 at 10:24

Ben

ming_codes · Answer 43 · 2011-08-26T19:21:27.570

2

Here's what I've found:

Random5 produces a range from 1~5, randomly distributed
If we run it 3 times and add them together we'll get a range of 3~15, randomly distributed
Perform arithmetic on the 3~15 range
1. (3~15) - 1 = (2~14)
2. (2~14)/2 = (1~7)

Then we get a range of 1~7, which is the Random7 we're looking for.

edited Aug 26 '11 at 19:21

answered Aug 26 '11 at 19:10

ming_codes

Random7 is not uniformally distributed. Take step 2. What are the odds of 3 being generated? 1/125. How about 4? 3/125. Assuming integer division in step 3, these are the only 2 ways of generating 1 in Random7. That only happens 4/125 of the time. Now maybe my mistake is using the word _uniformally_. – demongolem Jun 08 '12 at 17:09
The 3~15 probability distribution is not uniform, it is a bell curve, therefore this is incorrect. – Muhd Aug 21 '12 at 00:48
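
The figures in these comments (1/125 for a total of 3, 3/125 for a total of 4) can be reproduced by enumerating the 125 equally likely triples. The sketch below assumes integer division in step 2, as the first comment does.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

sums = Counter(a + b + c for a, b, c in product(range(1, 6), repeat=3))
print(sorted(sums.items()))               # ways to reach each total 3..15

# Steps 1 and 2 from the answer, using integer division.
random7 = Counter()
for total, ways in sums.items():
    random7[(total - 1) // 2] += ways

for value in sorted(random7):
    print(value, Fraction(random7[value], 125))
```

The totals pile up in the middle of the range, so the middle outputs are far more likely than 1 or 7.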

score 1 · Answer 44 · edited Jun 20 '20 at 09:12

The first thing that came to my mind is this, but I have no idea whether it's uniformly distributed. Implemented in Python:

import random

def rand5():
    return random.randint(1,5)

def rand7():
    return ( ( (rand5() -1) * rand5() ) %7 )+1
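
The open question of uniformity can be settled by enumeration, since two rand5() calls have only 25 equally likely outcomes. This is a short sketch that applies the same expression to every pair.

```python
from collections import Counter
from itertools import product

# Every (first call, second call) pair of rand5() results is equally likely.
counts = Counter(((a - 1) * b) % 7 + 1 for a, b in product(range(1, 6), repeat=2))

print(sorted(counts.items()))   # (output, number of the 25 pairs producing it)
```

A uniform result would need every output to appear equally often, which 25 outcomes split over 7 values cannot give.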

edited Jun 20 '20 at 09:12

Community

answered Apr 30 '12 at 04:33

ASKN

score 1 · Answer 45 · answered Jan 25 '13 at 18:54

Here's my general implementation, to generate a uniform in the range [0,N-1] given a uniform generator in the range [0,B-1].

import java.util.Random;

public class RandomUnif {

    public static final int BASE_NUMBER = 5;

    private static Random rand = new Random();

    /** given generator, returns uniform integer in the range 0..BASE_NUMBER-1 */
    public static int randomBASE() {
        return rand.nextInt(BASE_NUMBER);
    }

    /** returns uniform integer in the range 0..n-1 using randomBASE() */
    public static int randomUnif(int n) {
        int rand, factor;
        if( n <= 1 ) return 0;
        else if( n == BASE_NUMBER ) return randomBASE();
        if( n < BASE_NUMBER ) {
            factor = BASE_NUMBER / n;
            do
                rand = randomBASE() / factor;
            while(rand >= n);
            return rand;
        } else {
            factor = (n - 1) / BASE_NUMBER + 1;
            do {
                rand = factor * randomBASE() + randomUnif(factor);
            } while(rand >= n);
            return rand;
        }
    }
}

Not spectacularly efficient, but general and compact. Mean calls to base generator:

score 1 · Answer 46 · edited Jan 31 '14 at 14:03

There are a lot of solutions here that do not produce a uniform distribution, and many comments pointing that out, but the question does not state that as a requirement. The simplest solution is:

int rand_7() { return rand_5(); }

A random integer in the range 1 - 5 is clearly in the range 1 - 7. Well, technically, the simplest solution is to return a constant, but that's too trivial.

However, I think the existence of the rand_5 function is a red herring. Suppose the question was asked as "produce a uniformly distributed pseudo-random number generator with integer output in the range 1 - 7". That's a simple problem (not technically simple, but already solved, so you can look it up.)

On the other hand, if the question is interpreted to mean that you actually have a truly random number generator for integers in the range 1 - 5 (not pseudo random), then the solution is:

1) examine the rand_5 function
2) understand how it works
3) profit

dansalmo · Answer 47 · 2013-12-18T21:23:51.747

int rand7() {
    while (true) {
        // Two rand5() calls form a two-digit base-5 number in 0..24,
        // the smallest such range that covers 7 values.
        int num = (rand5()-1)*5 + rand5()-1;
        if (num < 21)  // accept 0..20 (three full cycles of 7); retry on 21..24
            return 1 + num%7;
    }
}

Python code:

from random import randint
def rand7():
    while True:
        num = (randint(1, 5)-1)*5 + randint(1, 5)-1
        if num < 21:
            return 1 + num%7

Test distribution for 100000 runs:

>>> rnums = []
>>> for _ in range(100000):
    rnums.append(rand7())
>>> {n:rnums.count(n) for n in set(rnums)}
{1: 15648, 2: 15741, 3: 15681, 4: 15847, 5: 15642, 6: 15806, 7: 15635}

score 1 · Answer 48 · answered Jan 06 '14 at 19:26

This is similar to @RobMcAfee's answer, except that I use a magic number instead of a two-dimensional array.

int rand7() {
    int m = 1203068;
    int r = (m >> (rand5() - 1) * 5 + rand5() - 1) & 7;

    return (r > 0) ? r : rand7();
}
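
To see why the magic number plays the same role as the 5x5 table, one can list the 3-bit window it yields for each of the 25 possible shift amounts. A Python sketch under the assumption that the C expression shifts by (i - 1) * 5 + (j - 1), which is how operator precedence groups it; `window_counts` is an illustrative name.

```python
from collections import Counter

M = 1203068  # the magic number from the answer above

# k = (i - 1) * 5 + (j - 1) runs over 0..24 for the 25 equally likely (i, j) pairs;
# (M >> k) & 7 extracts the three bits of M starting at bit k.
window_counts = Counter((M >> k) & 7 for k in range(25))

print(sorted(window_counts.items()))
```

Each of the values 1 to 7 appears three times and 0 appears four times, which is the same accept/retry pattern as the table with its four zero entries.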

answered Jan 06 '14 at 19:26

invisal


jantimon · Answer 49 · 2014-03-17T23:55:50.233

This solution was inspired by Rob McAfee.
However it doesn't need a loop and the result is a uniform distribution:

// Returns 1-5
var rnd5 = function(){
   return Math.floor(Math.random() * 5) + 1;
}
// Helper
var lastEdge = 0;
// Returns 1-7
var rnd7 = function () {
  var map = [
     [ 1, 2, 3, 4, 5 ],
     [ 6, 7, 1, 2, 3 ],
     [ 4, 5, 6, 7, 1 ],
     [ 2, 3, 4, 5, 6 ],
     [ 7, 0, 0, 0, 0 ]
  ];
  var result = map[rnd5() - 1][rnd5() - 1];
  if (result > 0) {
    return result;
  }
  lastEdge++;
  if (lastEdge > 7 ) {
    lastEdge = 1;
  }
  return lastEdge;
};

// Test for a uniform distribution
results = {}; for(i=0; i < 700000;i++) { var rand = rnd7(); results[rand] = results[rand] ? results[rand] + 1 : 1;} 
console.log(results)

Result: [1: 99560, 2: 99932, 3: 100355, 4: 100262, 5: 99603, 6: 100062, 7: 100226]


It is a uniform distribution, but it is not totally random, because in the first call, you are more likely to get the 1, in the second call the 2, and so on. This would only make sense if the consumer of the function has no way to know if it's the Nth call. Else you have 7/25 chance of guessing the number against 3/25... — Martin, Oct 16 '14 at 12:25

Graham Toal · Answer 50 · 2014-03-28T20:58:24.350

1

I think y'all are overthinking this. Doesn't this simple solution work?

int rand7(void)
{
    static int startpos = 0;
    startpos = (startpos+5) % (5*7);
    return (((startpos + rand5()-1)%7)+1);
}
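
One way to probe the question is to list which outputs a single call can produce for each state of the static counter. A minimal Python sketch; `reachable_by_state` is an illustrative name, and the enumeration assumes rand5() returns 1 to 5.

```python
# startpos only ever holds multiples of 5 below 35, so there are 7 states.
reachable_by_state = {
    startpos: sorted({(startpos + r - 1) % 7 + 1 for r in range(1, 6)})
    for startpos in range(0, 35, 5)
}

for startpos, outputs in reachable_by_state.items():
    print(startpos, outputs)
```

Every state reaches only 5 of the 7 values, so although the long-run counts balance out across the cycle, any individual call is restricted in a way that depends on how many calls came before it.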

edited Mar 28 '14 at 20:58

answered Mar 28 '14 at 20:23

Graham Toal


score 1 · Answer 51 · answered May 25 '14 at 10:53

Given a function which produces a random integer in the range 1 to 5 rand5(), write a function which produces a random integer in the range 1 to 7 rand7()

In my proposed solution, I call rand5 only once.

Real Solution

float rand7()
{
    return (rand5() * 7.0) / 5.0 ;
}

The distribution here is scaled, so it depends directly on the distribution of rand5

Integer Solution

int rand7()
{
    static int prev = 1;

    int cur = rand5();

    int r = cur * prev; // 1-25

    float f = r / 4.0; // 0.25-6.25

    f = f - 0.25; // 0-6

    f = f + 1.0; // 1-7

    prev = cur;

    return (int)f;
}

The distribution here depends on the series rand7(i) ~ rand5(i) * rand5(i-1)

with rand7(0) ~ rand5(0) * 1
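
The distribution of the integer solution can be tabulated by enumerating the 25 possible (prev, cur) pairs. This sketch treats prev and cur as independent uniform draws, which is a simplifying assumption: in the code above each draw is reused as prev on the next call, so consecutive outputs are also correlated. The names `scale` and `scaled_counts` are illustrative.

```python
from collections import Counter
from itertools import product

def scale(prev, cur):
    """Mirror the arithmetic of the integer solution for one (prev, cur) pair."""
    r = cur * prev            # 1..25
    f = r / 4.0 - 0.25 + 1.0  # 1.0..7.0
    return int(f)             # truncation, as the C cast does

scaled_counts = Counter(scale(p, c) for p, c in product(range(1, 6), repeat=2))

print(sorted(scaled_counts.items()))
```

Under that assumption the value 6 is never produced (no product of two numbers from 1 to 5 falls in 21..24) and 7 appears only when both draws are 5.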

score 1 · Answer 52 · answered Sep 06 '14 at 18:16

Here is an answer taking advantage of features in C++ 11

#include <functional>
#include <iostream>
#include <ostream>
#include <random>

int main()
{
    std::random_device rd;
    unsigned long seed = rd();
    std::cout << "seed = " << seed << std::endl;

    std::mt19937 engine(seed);

    std::uniform_int_distribution<> dist(1, 5);
    auto rand5 = std::bind(dist, engine);

    const int n = 20;
    for (int i = 0; i != n; ++i)
    {
        std::cout << rand5() << " ";
    }
    std::cout << std::endl;

    // Use a lambda expression to define rand7
    auto rand7 = [&rand5]()->int
    {
        for (int result = 0; ; result = 0)
        {
            // Take advantage of the fact that
            // 5**6 = 15625 = 15624 + 1 = 7 * (2232) + 1.
            // So we only have to discard one out of every 15625 numbers generated.

            // Generate a 6-digit number in base 5
            for (int i = 0; i != 6; ++i)
            {
                result = 5 * result + (rand5() - 1);
            }

            // result is in the range [0, 15625)
            if (result == 15625 - 1)
            {
                // Discard this number
                continue;
            }

            // We now know that result is in the range [0, 15624), a range that can
            // be divided evenly into 7 buckets guaranteeing uniformity
            result /= 2232;
            return 1 + result;
        }
    };

    for (int i = 0; i != n; ++i)
    {
        std::cout << rand7() << " ";
    }
    std::cout << std::endl;

    return 0;
}

Martin · Answer 53 · 2014-10-16T13:45:58.190

It would be cool if someone could give me feedback on this one. I used the JUnit-without-assert pattern because it's quick and easy to get running in Eclipse; I could also have just defined a main method. By the way, I am assuming rand5 gives values 0-4 (adding 1 would make it 1-5), and the same for rand7. So the discussion should be about the solution and its distribution, not about whether it goes from 0-4 or 1-5.

package random;

import java.util.Random;

import org.junit.Test;

public class RandomTest {


    @Test
    public void testName() throws Exception {
        long times = 100000000;
        int indexes[] = new int[7];
        for(int i = 0; i < times; i++) {
            int rand7 = rand7();
            indexes[rand7]++;
        }

        for(int i = 0; i < 7; i++)
            System.out.println("Value " + i + ": " + indexes[i]);
    }


    public int rand7() {
        return (rand5() + rand5() + rand5() + rand5() + rand5() + rand5() + rand5()) % 7;
    }


    public int rand5() {
        return new Random().nextInt(5);
    }


}

When I run it, I get this result:

Value 0: 14308087
Value 1: 14298303
Value 2: 14279731
Value 3: 14262533
Value 4: 14269749
Value 5: 14277560
Value 6: 14304037

This seems like a very fair distribution, doesn't it?

If I add rand5() fewer or more times (where the number of times is not divisible by 7), the distribution clearly shows offsets. For instance, adding rand5() 3 times:

Value 0: 15199685
Value 1: 14402429
Value 2: 12795649
Value 3: 12796957
Value 4: 14402252
Value 5: 15202778
Value 6: 15200250

So, this would lead to the following:

public int rand(int range) {
    int randomValue = 0;
    for(int i = 0; i < range; i++) {
        randomValue += rand5();
    }
    return randomValue % range;

}

And then, I could go further:

public static final int ORIGN_RANGE = 5;
public static final int DEST_RANGE  = 7;

@Test
public void testName() throws Exception {
    long times = 100000000;
    int indexes[] = new int[DEST_RANGE];
    for(int i = 0; i < times; i++) {
        int rand7 = convertRand(DEST_RANGE, ORIGN_RANGE);
        indexes[rand7]++;
    }

    for(int i = 0; i < DEST_RANGE; i++)
        System.out.println("Value " + i + ": " + indexes[i]);
}


public int convertRand(int destRange, int originRange) {
    int randomValue = 0;
    for(int i = 0; i < destRange; i++) {
        randomValue += rand(originRange);
    }
    return randomValue % destRange;

}


public int rand(int range) {
    return new Random().nextInt(range);
}

I tried this replacing the destRange and originRange with various values (even 7 for ORIGIN and 13 for DEST), and I get this distribution:

Value 0: 7713763
Value 1: 7706552
Value 2: 7694697
Value 3: 7695319
Value 4: 7688617
Value 5: 7681691
Value 6: 7674798
Value 7: 7680348
Value 8: 7685286
Value 9: 7683943
Value 10: 7690283
Value 11: 7699142
Value 12: 7705561

What I conclude from this is that you can convert any random generator to any other by summing the origin generator "destination" times. The sum has a roughly Gaussian distribution (middle values more likely, edge values less common). However, taking the sum modulo the destination range seems to spread itself evenly across this Gaussian distribution. It would be great to have feedback from a mathematician.

What is cool is that the cost is 100% predictable and constant, whereas the other solutions carry a small probability of looping indefinitely.
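
The sampled counts can be compared against an exact calculation. The sketch below counts, for each residue modulo 7, how many of the equally likely outcomes of the summed draws land on it; the function name `residue_counts` and its parameters are illustrative, not from the answer.

```python
def residue_counts(num_terms, base=5, modulus=7):
    """Exact number of outcomes per residue for a sum of num_terms draws from 0..base-1."""
    counts = [0] * modulus
    counts[0] = 1  # the empty sum is 0
    for _ in range(num_terms):
        nxt = [0] * modulus
        for residue, ways in enumerate(counts):
            for draw in range(base):
                nxt[(residue + draw) % modulus] += ways
        counts = nxt
    return counts

seven_terms = residue_counts(7)
print(seven_terms, sum(seven_terms))
```

The total is 5**7 = 78125, which is not a multiple of 7, so the seven counts cannot all be equal: the method gets close to uniform but is not exactly uniform, and no fixed number of rand5() calls can be, for the same reason.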

score 1 · Answer 54 · edited May 23 '17 at 12:18

Similar to Martin's answer, but resorts to throwing entropy away much less frequently:

#include <limits.h>  /* INT_MAX */

int rand7(void) {
  static int m = 1;
  static int r = 0;

  for (;;) {
    while (m <= INT_MAX / 5) {
      r = r + m * (rand5() - 1);
      m = m * 5;
    }
    int q = m / 7;
    if (r < q * 7) {
      int i = r % 7;
      r = r / 7;
      m = q;
      return i + 1;
    }
    r = r - q * 7;
    m = m - q * 7;
  }
}

Here we build up a random value between 0 and m-1, and try to maximise m by adding as much state as will fit without overflow (INT_MAX being the largest value that will fit in an int in C, or you can replace that with any large value that makes sense in your language and architecture).

Then, if r falls within the largest possible interval evenly divisible by 7, it contains a viable result: we divide that interval by 7, take the remainder as our result, and return the rest of the value to our entropy pool. Otherwise r is in the other interval, which doesn't divide evenly, and we have to discard and restart our entropy pool from that ill-fitting interval.

Compared with the popular answers in here, it calls rand5() about half as often on average.

The divides can be factored out into trivial bit-twiddles and LUTs for performance.
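
For readers who want to measure the call rate themselves, here is a rough Python port of the C function with a call counter added. It is a sketch under assumptions: `INT_MAX` is fixed at the 32-bit value, and the class name `Rand7Pool` and its `calls` attribute are made up for this illustration.

```python
import random

INT_MAX = 2**31 - 1  # assumed 32-bit int limit, mirroring the C code


class Rand7Pool:
    """Port of the entropy-pool rand7(); counts how often rand5 is called."""

    def __init__(self, rand5):
        self.rand5 = rand5
        self.m = 1      # r is uniform in [0, m)
        self.r = 0
        self.calls = 0  # number of rand5() calls made so far

    def next(self):
        while True:
            # Top up the pool with base-5 digits while they still fit.
            while self.m <= INT_MAX // 5:
                self.r += self.m * (self.rand5() - 1)
                self.m *= 5
                self.calls += 1
            q = self.m // 7
            if self.r < q * 7:
                i = self.r % 7
                self.r //= 7
                self.m = q
                return i + 1
            # r fell in the leftover interval: keep what remains and retry.
            self.r -= q * 7
            self.m -= q * 7


pool = Rand7Pool(lambda: random.randint(1, 5))
outputs = [pool.next() for _ in range(20000)]
print(pool.calls / len(outputs))
```

The printed ratio is the average number of rand5() calls per rand7() result for that run, which can be compared with the roughly 2.38 calls of the simple two-call rejection method.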

score 1 · Answer 55 · answered Apr 01 '17 at 19:30

What is a simple solution? (rand5() + rand5()) % 7 + 1
What is an effective solution to reduce memory usage or run on a slower CPU? The same one: it is effective because it calls rand5() only twice and has O(1) space complexity.

Consider rand5() gives out random numbers from 1 to 5(inclusive).
(1 + 1) % 7 + 1 = 3
(1 + 2) % 7 + 1 = 4
(1 + 3) % 7 + 1 = 5
(1 + 4) % 7 + 1 = 6
(1 + 5) % 7 + 1 = 7

(2 + 1) % 7 + 1 = 4
(2 + 2) % 7 + 1 = 5
(2 + 3) % 7 + 1 = 6
(2 + 4) % 7 + 1 = 7
(2 + 5) % 7 + 1 = 1
...

(5 + 1) % 7 + 1 = 7
(5 + 2) % 7 + 1 = 1
(5 + 3) % 7 + 1 = 2
(5 + 4) % 7 + 1 = 3
(5 + 5) % 7 + 1 = 4
...

and so on
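
The table above can be completed mechanically. This Python sketch enumerates all 25 equally likely pairs and counts each result; `pair_counts` is an illustrative name.

```python
from collections import Counter
from itertools import product

# Every equally likely pair of rand5() results, mapped through (a + b) % 7 + 1.
pair_counts = Counter((a + b) % 7 + 1 for a, b in product(range(1, 6), repeat=2))

print(sorted(pair_counts.items()))
```

The counts range from 3 to 5 out of 25 (7 is the most frequent result), so every value from 1 to 7 is reachable but not with equal probability.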

answered Apr 01 '17 at 19:30

Devendra Lattu


What do you do at (7+5)% 7 +1 = ??, Does this add a bias or require a re roll, what issues are there with your first answer? – daniel Apr 24 '17 at 13:44
@daniel, this won't be the case. `rand5()` will not generate a random number `7`. Thus at max, it can be `(5 + 5) % 7 + 1 = 4`. – Devendra Lattu Apr 24 '17 at 14:37
So its not uniform, you are allowing 1,2,3,4 to be picked more often. If you look at the most up voted answer your solution is like his except instead of 0,0,0,0 you have 1,2,3,4 – daniel Apr 24 '17 at 15:03
Why is this answer not correct? The problem statement does not require the result to be uniformly distributed. – RBF06 Dec 29 '17 at 14:32

score 1 · Answer 56 · answered Feb 18 '11 at 02:43

how about this

rand5()%2+rand5()%2+rand5()%2+rand5()%2+rand5()%2+rand5()%2

Not sure whether this is uniformly distributed. Any suggestions?

answered Feb 18 '11 at 02:43

rich


It's not uniform because rand5()%2 is not uniform. 5 possible answers can't be split evenly into two categories - the 0/1 split will be 60/40 instead of 50/50 (60% 0's if rand5 returns 0..4, 60% 1's if it returns 1..5). – Mark Reed Mar 28 '12 at 01:09

ibrahim tanyalcin · Answer 57 · 2018-09-10T08:34:10.760

Came here from a link about expanding a float range. This one is more fun. Rather than explain how I got to the conclusion: it occurred to me that for a given random integer generating function f with "base" b (4 in this case, I'll explain why), it can be expanded like below:

(b^0 * f() + b^1 * f() + b^2 * f() .... b^p * f()) / (b^(p+1) - 1) * (b-1)

This will convert a random generator to a FLOAT generator. I define two parameters here, b and p. Although the "base" here is 4, b can in fact be anything; it can also be an irrational number, etc. p, which I call precision, controls how fine-grained you want your float generator to be. Think of it as the number of calls made to rand5 for each call of rand7.

But I realized if you set b to base+1 (which is 4+1 = 5 in this case), it's a sweet spot and you'll get a uniform distribution. First get rid of this 1-5 generator, it is in truth rand4() + 1:

function rand4(){
    return Math.random() * 5 | 0;
}

To get there, you can substitute rand4 with rand5()-1

Next is to convert rand4 from an integer generator to a float generator

function toFloat(f,b,p){
    b = b || 2;
    p = p || 3;
    return (Array.apply(null,Array(p))
    .map(function(d,i){return f()})
    .map(function(d,i){return Math.pow(b,i)*d})
    .reduce(function(ac,d,i){return ac += d;}))
    /
    (
        (Math.pow(b,p) - 1)
        /(b-1)
    )
}

This will apply the first function I wrote to a given rand function. Try it:

toFloat(rand4) //1.4285714285714286 base = 2, precision = 3
toFloat(rand4,3,4) //0.75 base = 3, precision = 4
toFloat(rand4,4,5) //3.7507331378299122 base = 4, precision = 5
toFloat(rand4,5,6) //0.2012288786482335 base = 5, precision =6
...

Now you can convert this float range (0-4 INCLUSIVE) to any other float range and then downgrade it to be an integer. Here our base is 4 because we are dealing with rand4, therefore a value b=5 will give you a uniform distribution. As the b grows past 4, you will start introducing periodic gaps in the distribution. I tested for b values ranging from 2 to 8 with 3000 points each and compared to native Math.random of javascript, looks to me even better than the native one itself:

http://jsfiddle.net/ibowankenobi/r57v432t/

For the above link, click on the "bin" button on the top side of the distributions to decrease the binning size. The last graph is native Math.random, the 4th one where d=5 is uniform.

After you get your float range either multiply with 7 and throw the decimal part or multiply with 7, subtract 0.5 and round:

((toFloat(rand4,5,6)/4 * 7) | 0) + 1   ---> occasionally you'll get 8 with 1/4^6 probability.
Math.round((toFloat(rand4,5,6)/4 * 7) - 0.5) + 1 --> between 1 and 7

score 1 · Answer 58 · edited Apr 05 '11 at 09:42

I thought of an interesting solution to this problem and wanted to share it.

function rand7() {

    var returnVal = 4;

    for (var n=0; n<3; n++) {
        var rand = rand5();

        if (rand==1||rand==2){
            returnVal+=1;
        }
        else if (rand==3||rand==4) {
            returnVal-=1;
        }
    }

    return returnVal;
}

I built a test function that loops through rand7() 10,000 times, sums up all of the return values, and divides the sum by 10,000. If rand7() is working correctly, our calculated average should be 4 - for example, (1+2+3+4+5+6+7) / 7 = 4. After doing multiple tests, the average is indeed right at 4 :)
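
An average of 4 does not by itself show uniformity, so it helps to look at the full distribution. The sketch below enumerates all 125 equally likely outcomes of the three rand5() calls; `step`, `walk_counts` and `mean` are illustrative names.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def step(r):
    """Contribution of one rand5() result, as in the loop above."""
    if r in (1, 2):
        return 1
    if r in (3, 4):
        return -1
    return 0  # a 5 leaves the value unchanged

walk_counts = Counter(4 + sum(step(r) for r in draws)
                      for draws in product(range(1, 6), repeat=3))
mean = Fraction(sum(v * n for v, n in walk_counts.items()), 125)

print(sorted(walk_counts.items()), mean)
```

The mean is exactly 4, yet 1 and 7 each occur in only 8 of the 125 outcomes while 4 occurs in 25, so the results cluster around the middle.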

edited Apr 05 '11 at 09:42

Mr.Wizard

answered Apr 01 '11 at 02:58

Eric Rowell

if you got all ones and sevens the average could be 4. – dqhendricks Apr 01 '11 at 15:10
dqhendricks is right. You need to show that the _distribution_ of the values 1-7 that your solution gives is isomorphic to the original distribution. (In other words, that changing the range didn't make the odds "less random".) – Shalom Craimer Jan 11 '12 at 09:49
Your mean is 4, but the distribution isn't uniform. I don't think you can produce a uniform distribution using less than 7 calls of rand5(), both 5 and 7 being primes. – Alexey Feldgendler Nov 25 '12 at 21:54

score 0 · Answer 59 · answered Apr 05 '12 at 20:27

Why don't you just divide by 5 and multiply by 7, and then round? (Granted, you would have to use floating-point numbers.)

It's much easier and more reliable (really?) than the other solutions. E.g. in Python:

def ranndomNo7():
    import random
    rand5 = random.randint(0, 4)        # Produces range: [0, 4]
    rand7 = int(rand5 / 5.0 * 7 + 0.5)  # /5, *7, +0.5 and floor()
    return rand7

Wasn't that easy?

answered Apr 05 '12 at 20:27

Yatharth Agarwal


You still only end up with 5 distinct integers, not 7. You just have changed the 5 integers that are generated – demongolem Jun 08 '12 at 16:59

score 0 · Answer 60 · answered Sep 25 '12 at 15:11

int rand7()
{
    return ( rand5() + (rand5()%3) );
}

rand5() - Returns values from 1-5
rand5()%3 - Returns values from 0-2
So, when summing up the total value will be between 1-7

answered Sep 25 '12 at 15:11

coder


Indeed, summing randoms produces a Gaussian-like result, with values in the middle of the range more common than values at the edges... – Martin Oct 16 '14 at 12:18
This is not uniform – ldmtwo Mar 15 '22 at 02:08

score 0 · Answer 61 · answered Sep 02 '13 at 12:34

This expression is sufficient to get random integers between 1 - 7

int j = ( rand5()*2 + 4 ) % 7 + 1;

answered Sep 02 '13 at 12:34

user2713461


As long as you don't need any 3's or 5's. – dansalmo Dec 18 '13 at 21:03

score 0 · Answer 62 · answered Oct 15 '14 at 02:53

The simple solution has been well covered: take two random5 samples for one random7 result and do it over if the result is outside the range that generates a uniform distribution. If your goal is to reduce the number of calls to random5 this is extremely wasteful - the average number of calls to random5 for each random7 output is 2.38 rather than 2 due to the number of thrown away samples.

You can do better by using more random5 inputs to generate more than one random7 output at a time. For results calculated with a 31-bit integer, the optimum comes when using 12 calls to random5 to generate 9 random7 outputs, taking an average of 1.34 calls per output. It's efficient because only 2018983 out of 244140625 results need to be scrapped, or less than 1%.

Demo in Python:

import random

def random5():
    return random.randint(1, 5)

def random7gen(n):
    count = 0
    while n > 0:
        samples = 6 * 7**9
        while samples >= 6 * 7**9:
            samples = 0
            for i in range(12):
                samples = samples * 5 + random5() - 1
                count += 1
        samples //= 6
        for outputs in range(9):
            yield samples % 7 + 1, count
            samples //= 7
            count = 0
            n -= 1
            if n == 0: break

>>> from collections import Counter
>>> Counter(x for x,i in random7gen(10000000))
Counter({2: 1430293, 4: 1429298, 1: 1428832, 7: 1428571, 3: 1428204, 5: 1428134, 6: 1426668})
>>> sum(i for x,i in random7gen(10000000)) / 10000000.0
1.344606
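
The figures quoted in this answer follow from a few lines of arithmetic, sketched here in Python. The variable names are illustrative; the only inputs are the 12-call batch size, the 9 outputs per batch, and the acceptance bound 6 * 7**9 used in the demo above.

```python
total = 5 ** 12        # equally likely outcomes of 12 random5 calls
accepted = 6 * 7 ** 9  # largest multiple of 7**9 that fits below 5**12
rejected = total - accepted

# Simple method: 2 calls per attempt, 21 of 25 outcomes accepted.
simple_calls = 2 * 25 / 21
# Batched method: 12 calls per attempt, 9 outputs per accepted attempt.
batched_calls = 12 * total / accepted / 9

print(rejected, round(simple_calls, 2), round(batched_calls, 2))
```

This reproduces the 2018983 discarded results out of 244140625, and the averages of about 2.38 and 1.34 calls per output.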

score 0 · Answer 63 · answered Oct 23 '14 at 22:27

First, I shift random5() by one point six times to get 7 random numbers. Second, I add the 7 numbers to obtain a common sum. Third, I take the remainder of dividing by 7. Last, I add 1 to get results from 1 to 7. This method gives an equal probability of getting numbers in the range from 1 to 7, with the exception of 1, which has a slightly higher probability.

public int random7(){
    Random random = new Random();
    //function (1 + random.nextInt(5)) is given
    int random1_5 = 1 + random.nextInt(5); // 1,2,3,4,5
    int random2_6 = 2 + random.nextInt(5); // 2,3,4,5,6
    int random3_7 = 3 + random.nextInt(5); // 3,4,5,6,7
    int random4_8 = 4 + random.nextInt(5); // 4,5,6,7,8
    int random5_9 = 5 + random.nextInt(5); // 5,6,7,8,9
    int random6_10 = 6 + random.nextInt(5); //6,7,8,9,10
    int random7_11 = 7 + random.nextInt(5); //7,8,9,10,11

    //sumOfRandoms is between 28 and 56
    int sumOfRandoms = random1_5 + random2_6 + random3_7 + 
                       random4_8 + random5_9 + random6_10 + random7_11;
    //result is number between 0 and 6, and
    //equals 0 if sumOfRandoms = 28 or 35 or 42 or 49 or 56 , 5 options
    //equals 1 if sumOfRandoms = 29 or 36 or 43 or 50, 4 options
    //equals 2 if sumOfRandoms = 30 or 37 or 44 or 51, 4 options
    //equals 3 if sumOfRandoms = 31 or 38 or 45 or 52, 4 options
    //equals 4 if sumOfRandoms = 32 or 39 or 46 or 53, 4 options
    //equals 5 if sumOfRandoms = 33 or 40 or 47 or 54, 4 options
    //equals 6 if sumOfRandoms = 34 or 41 or 48 or 55, 4 options
    //It means that the probabilities of getting numbers between 0 and 6 are almost equal.
    int result = sumOfRandoms % 7;
    //we should add 1 to move the interval [0,6] to the interval [1,7]
    return 1 + result;
}

How is `sumOfRandoms` different from the sum of seven `random.nextInt(5)`-calls, + 21? Are the options listed above equally probable? — greybeard, Oct 23 '14 at 22:50
Yes, you are right. There are no difference between this methods. — Michael Katkov, Oct 24 '14 at 01:34
Throw a pair of cubic dice: what are the probabilities of a sum of 2, 3, 4, 5, 6, 7? — greybeard, Oct 24 '14 at 08:15
1/36,1/18,1/12,1/9,5/36,1/6. Yes, you are right, my arguments were not accurate. I didn't count probabilities in sum. I jumped over it. Thank you for your help. — Michael Katkov, Oct 24 '14 at 15:31

score 0 · Answer 64 · edited Jun 20 '20 at 09:12

Here's mine, this attempts to recreate Math.random() from multiple rand5() function calls, reconstructing a unit interval (the output range of Math.random()) by re-constructing it with "weighted fractions"(?). Then using this random unit interval to produce a random integer between 1 and 7:

function rand5(){
  return Math.floor(Math.random()*5)+1;
}
function rand7(){
  var uiRandom=0;
  var div=1;
  for(var i=0; i<7; i++){
    div*=5;
    var term=(rand5()-1)/div;
    uiRandom+=term;
  }
  //return uiRandom;
  return Math.floor(uiRandom*7)+1; 
}

To paraphrase: we take random integers between 0-4 (just rand5()-1), multiply each result by 1/5, 1/25, 1/125, ..., and then sum them together. It's similar to how binary weighted fractions work; I suppose we'll call it a quinary (base-5) weighted fraction instead: producing a number from 0 -- 0.999999 as a series of (1/5)^n terms.

Modifying the function to take any input/output random integer range should be trivial. And the code above can be optimized when rewritten as a closure.

Alternatively, we can also do this:

function rand5(){
  return Math.floor(Math.random()*5)+1;
}
function rand7(){
  var buffer=[];
  var div=1;
  for (var i=0; i<7; i++){
    buffer.push((rand5()-1).toString(5));
    div*=5;
  }
  var n=parseInt(buffer.join(""),5);
  var uiRandom=n/div;
  //return uiRandom;
  return Math.floor(uiRandom*7)+1; 
}

Instead of fiddling with constructing a quinary (base-5) weighted fraction, we actually make a quinary number and turn it into a fraction (0--0.9999... as before), then compute our random 1--7 digit from there.

Results for above (code snippet #2: 3 runs of 100,000 calls each):

1: 14263; 2: 14414; 3: 14249; 4: 14109; 5: 14217; 6: 14361; 7: 14387

1: 14205; 2: 14394; 3: 14238; 4: 14187; 5: 14384; 6: 14224; 7: 14368

1: 14425; 2: 14236; 3: 14334; 4: 14232; 5: 14160; 6: 14320; 7: 14293

Likable for using an approach symmetrical to former ones. What is special about loop count 7? — greybeard, Oct 24 '14 at 08:29
@greybeard Initially I was thinking in terms of least common multiple so that a 1-7 result could evenly be obtained from seven calls to the 1-5 generator using some other method. But I changed things a bit. Now, I figure 7 digits of a 5-nary number are equivalent to 5 digits of a 7-nary number in terms of obtaining a 1-7 value from a distribution of 1-5 values without any bias, skew, or quantization errors. This is very obvious when you consider one or two loops and try to obtain a 1-7 result from it. So I used 7 loops to prevent any kind of "mis-alignment" (I think). — jongo45, Oct 24 '14 at 08:57

Colin Su · Answer 65 · 2015-04-27T08:14:21.333

The main concept in this problem is the uniform distribution. Here is a simple, recursive solution to the problem.

presume we already have rand5() in our scope:

def rand7():
    # twoway = 0 or 1 in the same probability
    twoway = None
    while not twoway in (1, 2):
        twoway = rand5()
    twoway -= 1

    ans = rand5() + twoway * 5

    return ans if ans in range(1,8) else rand7()

Explanation

We can divide this program into 2 parts:

Loop on rand5() until we get 1 or 2, so that twoway is 1 or 2 with probability 1/2 each.
Compose ans as rand5() + twoway * 5, which is exactly the result of a rand10(); if it does not fall in the range we need (1~7), we run rand7 again.

P.S. We cannot simply run a while loop around the second part alone, because each attempt needs its own independent value of twoway.

But there is a trade-off: because of the while loop in the first part and the recursion in the return statement, this function has no guaranteed execution time, so it is not actually efficient.

Result

I've made a simple test for observing the distribution to my answer.

result = [ rand7() for x in xrange(777777) ]

ans = {
    1: 0,
    2: 0,
    3: 0,
    4: 0,
    5: 0,
    6: 0,
    7: 0,
}

for i in result:
    ans[i] += 1

print ans

It gave

{1: 111170, 2: 110693, 3: 110651, 4: 111260, 5: 111197, 6: 111502, 7: 111304}

Therefore we can see that this answer gives a uniform distribution.

Simplified Answer

If you don't care about the execution time of this function, here's a simplified answer based on the above answer I gave:

def rand7():
    ans = rand5() + (rand5()-1) * 5
    return ans if ans < 8 else rand7()

This raises the probability of rejection (any value of 8 or more triggers a retry), but it is probably the shortest answer to this problem.

score 0 · Answer 66 · answered Aug 11 '15 at 18:38

This algorithm reduces the number of calls of rand5 to the theoretical minimum of 7/5: it calls rand5 7 times to produce the next 5 rand7 numbers.

No random bit is ever rejected, and there is NO possibility of waiting forever for a result.

#!/usr/bin/env ruby

# random integer from 1 to 5
def rand5
    STDERR.putc '.'
    1 + rand( 5 )
end

@bucket = 0
@bucket_size = 0

# random integer from 1 to 7
def rand7
    if @bucket_size == 0
        @bucket = 7.times.collect{ |d| rand5 * 5**d }.reduce( &:+ )
        @bucket_size = 5
    end

    next_rand7 = @bucket%7 + 1

    @bucket      /= 7
    @bucket_size -= 1

    return next_rand7
end

35.times.each{ putc rand7.to_s }
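
Because the bucket takes each value in a contiguous range exactly once, the distribution of all five draws taken from one bucket can be enumerated exactly. The Python sketch below does that as a check on the claims above; it assumes rand5 returns 1 to 5 as in the Ruby code, and `draw_counts` is an illustrative name.

```python
from collections import Counter

# The bucket is the sum of rand5 * 5**d for d = 0..6 with each rand5 in 1..5,
# which hits every integer from low to high exactly once.
low = sum(5 ** d for d in range(7))       # every call returns 1
high = sum(5 * 5 ** d for d in range(7))  # every call returns 5

draw_counts = [Counter() for _ in range(5)]
for bucket in range(low, high + 1):
    for k in range(5):
        draw_counts[k][bucket % 7 + 1] += 1
        bucket //= 7

for k, counts in enumerate(draw_counts, start=1):
    print(k, sorted(counts.items()))
```

The first draw from a bucket is within one outcome of uniform, but the later draws are visibly skewed because the remaining quotient no longer covers whole cycles of 7. Avoiding rejection entirely therefore costs uniformity.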

Emil Jeřábek · Answer 67 · 2019-08-28T13:28:39.977

Here is a solution that tries to minimize the number of calls to rand5() while keeping the implementation simple and efficient; in particular, it does not require arbitrary large integers unlike Adam Rosenfield’s second answer. It exploits the fact that 23/19 = 1.21052... is a good rational approximation to log(7)/log(5) = 1.20906..., thus we can generate 19 random elements of {1,...,7} out of 23 random elements of {1,...,5} by rejection sampling with only a small rejection probability. On average, the algorithm below takes about 1.266 calls to rand5() for each call to rand7(). If the distribution of rand5() is uniform, so is rand7().

#include <stdint.h>  /* uint_fast64_t */

uint_fast64_t pool;

int capacity = 0;

void new_batch (void)
{
  uint_fast64_t r;
  int i;

  do {
    r = 0;
    for (i = 0; i < 23; i++)
      r = 5 * r + (rand5() - 1);
  } while (r >= 11398895185373143ULL);  /* 7**19, a bit less than 5**23 */

  pool = r;
  capacity = 19;
}

int rand7 (void)
{
  int r;

  if (capacity == 0)
    new_batch();

  r = pool % 7;
  pool /= 7;
  capacity--;

  return r + 1;
}
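
The constants in this answer can be rechecked with exact integer arithmetic. A short Python sketch; the variable names are illustrative and only the 23-in, 19-out batch sizes come from the answer.

```python
batch_in = 23   # rand5() calls per attempt
batch_out = 19  # rand7() results per accepted attempt

accept_limit = 7 ** batch_out  # the rejection threshold used in new_batch()
total = 5 ** batch_in          # equally likely outcomes of one attempt

accept_prob = accept_limit / total
# Expected attempts is 1 / accept_prob; each attempt costs batch_in calls.
calls_per_output = batch_in / accept_prob / batch_out

print(accept_limit, accept_prob, calls_per_output)
```

The threshold equals the literal 11398895185373143 in the C code, and the expected cost comes out at about 1.266 calls to rand5() per rand7() result, as stated.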

score 0 · Answer 68 · answered Aug 31 '20 at 23:24

For the range [1, 5] to [1, 7], this is equivalent to rolling a 7-sided die with a 5-sided one.

However, this can't be done without "wasting" randomness (or running forever in the worst case), since all the prime factors of 7 (namely 7) don't divide 5. Thus, the best that can be done is to use rejection sampling to get arbitrarily close to no "waste" of randomness (such as by batching multiple rolls of the 5-sided die until 5^n is "close enough" to a power of 7). Solutions to this problem were already given in other answers.

More generally, an algorithm to roll a k-sided die with a p-sided die will inevitably "waste" randomness (and run forever in the worst case) unless "every prime number dividing k also divides p", according to Lemma 3 in "Simulating a dice with a dice" by B. Kloeckner. For example, take the much more practical case that p is a power of 2 and k is arbitrary. In this case, this "waste" and indefinite running time are inevitable unless k is also a power of 2.

score 0 · Answer 69 · answered Mar 15 '22 at 02:36

Python: There's a simple two line answer that uses a combination of spatial algebra and modulus. This is not intuitive. My explanation of it is confusing, but is correct.

Knowing that 5*7=35 and 7/5 = 1 remainder 2. How to guarantee that sum of remainders is always 0? 5*[7/5 = 1 remainder 2] --> 35/5 = 7 remainder 0

Imagine we had a ribbon that was wrapped around a pole with a perimeter=7. The ribbon would need to be 35 units to wrap evenly. Select 7 random ribbon pieces len=[1...5]. The effective length ignoring the wrap around is the same as this method of converting rand5() into rand7().

import numpy as np
import pandas as pd
# display is a notebook function FYI
def rand5(): ## random uniform int [1...5]
    return np.random.randint(1,6)

n_trials = 1000
samples = [rand5() for _ in range(n_trials)]

display(pd.Series(samples).value_counts(normalize=True))
# 4    0.2042
# 5    0.2041
# 2    0.2010
# 1    0.1981
# 3    0.1926
# dtype: float64
    
def rand7(): # magic algebra
    x = sum(rand5() for _ in range(7))
    return x%7 + 1

samples = [rand7() for _ in range(n_trials)]

display(pd.Series(samples).value_counts(normalize=False))
# 6    1475
# 2    1475
# 3    1456
# 1    1423
# 7    1419
# 4    1393
# 5    1359
# dtype: int64
    
df = pd.DataFrame([
    pd.Series([rand7() for _ in range(n_trials)]).value_counts(normalize=True)
    for _ in range(1000)
])
df.describe()
#      1    2   3   4   5   6   7
# count 1000.000000 1000.000000 1000.000000 1000.000000 1000.000000 1000.000000 1000.000000
# mean  0.142885    0.142928    0.142523    0.142266    0.142704    0.143048    0.143646
# std   0.010807    0.011526    0.010966    0.011223    0.011052    0.010983    0.011153
# min   0.112000    0.108000    0.101000    0.110000    0.100000    0.109000    0.110000
# 25%   0.135000    0.135000    0.135000    0.135000    0.135000    0.135000    0.136000
# 50%   0.143000    0.142000    0.143000    0.142000    0.143000    0.142000    0.143000
# 75%   0.151000    0.151000    0.150000    0.150000    0.150000    0.150000    0.151000
# max   0.174000    0.181000    0.175000    0.178000    0.189000    0.176000    0.179000
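
The sampled frequencies above can be cross-checked by exhaustive counting rather than simulation. The sketch below is not part of the original answer; it assumes an ideal, uniform rand5() and enumerates all 5**7 = 78125 equally likely sequences of seven draws. Because 78125 is not a multiple of 7, the seven outputs cannot all receive the same number of sequences, so the scheme appears to be close to uniform but not exactly uniform. The helper name `exact_rand7_counts` is made up for illustration.

```python
from collections import Counter
from itertools import product

def exact_rand7_counts():
    """Count how many of the 5**7 equally likely draw sequences map to each output."""
    counts = Counter()
    for draws in product(range(1, 6), repeat=7):
        # Same formula as rand7() above: sum of seven rand5() values, mod 7, plus 1.
        counts[sum(draws) % 7 + 1] += 1
    return counts

counts = exact_rand7_counts()
total = sum(counts.values())
for value in sorted(counts):
    print(value, counts[value], counts[value] / total)
```

Comparing the largest and smallest counts gives the exact size of the bias, which sampling noise at 1000 trials would hide.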

Bite code · Answer 70 · 2009-04-30T17:52:27.080

-1

I feel stupid in front of all these complicated answers.

Why can't it be:

int random1_to_7()
{
  return (random1_to_5() * 7) / 5;  
}

?

edited Apr 30 '09 at 17:52

answered Apr 30 '09 at 17:46

Bite code

Test this - it doesn't work. It won't provide an even distribution across all 7 numbers. – Jon Tackabury Apr 30 '09 at 18:25
This would work if we were interested in real numbers, but since we're dealing with ints, that code will only produce 1, 2, 4, 5, or 7, and never 3 or 6. – ESRogs Apr 30 '09 at 19:00
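
The point made in the comments can be checked by listing every input. This is a small Python sketch, not the answerer's code; `random1_to_7_candidate` is a hypothetical name, and `//` is used to mirror the C integer division in the answer.

```python
def random1_to_7_candidate(x):
    # Integer division, as in the C expression (random1_to_5() * 7) / 5.
    return (x * 7) // 5

# Map each of the five possible inputs to its output.
reachable = [random1_to_7_candidate(x) for x in range(1, 6)]
print(reachable)  # [1, 2, 4, 5, 7]
```

Five inputs can produce at most five distinct outputs, so no deterministic function of a single rand5() call can cover all seven values.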

score -2 · Answer 71 · answered May 12 '15 at 11:17

-2

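import random

def rand5():
    return random.randint(1, 5)   # random integer from 1 to 5 inclusive

def rand7():
    rand = rand5() + rand5() - 1  # sum of two rolls, shifted into 1..9
    if rand > 7:                  # if the number is > 7, call rand7() again
        return rand7()
    return rand % 7 + 1           # return the value instead of printing it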

I guess this will be the easiest solution, but everywhere people have suggested 5*rand5() + rand5() - 5, as in http://www.geeksforgeeks.org/generate-integer-from-1-to-7-with-equal-probability/. Can someone explain what is wrong with rand5()+rand5()-1?

answered May 12 '15 at 11:17

user2622350


This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post - you can always comment on your own posts, and once you have sufficient [reputation](http://stackoverflow.com/help/whats-reputation) you will be able to [comment on any post](http://stackoverflow.com/help/privileges/comment). – UmNyobe May 12 '15 at 12:04
This gives an answer as well as asking what is wrong with my approach. Thanks. – user2622350 May 12 '15 at 12:23

Don't ask questions in answers. Your problem is that the probabilities are not uniform. For example, you are twice as likely to obtain 2 as 1 using rand5()+rand5()-1. Indeed `1 = (rand5 = 1) + (rand5 = 1) - 1`, whereas `2 = (rand5 = 1) + (rand5 = 2) - 1` or `(rand5 = 2) + (rand5 = 1) - 1` – UmNyobe May 12 '15 at 13:55
This approach might not be suitable, as it runs longer than needed: if you used a flawed `rand5` which always returns 4 or more, this algorithm runs forever – Nico Haase Apr 17 '20 at 11:22
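
To make the non-uniformity concrete, here is a sketch that enumerates all 25 equally likely pairs of rolls, assuming an ideal rand5(). It applies the same steps as the answer (add, subtract 1, reject values above 7, then take `% 7 + 1`) and reports the exact probability of each output. The function name `answer71_distribution` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def answer71_distribution():
    """Exact output probabilities of rand5()+rand5()-1 with rejection above 7."""
    accepted = Counter()
    for a, b in product(range(1, 6), repeat=2):
        rand = a + b - 1          # ranges over 1..9
        if rand <= 7:             # values 8 and 9 trigger a retry, so skip them
            accepted[rand % 7 + 1] += 1
    total = sum(accepted.values())
    return {v: Fraction(accepted[v], total) for v in sorted(accepted)}

dist = answer71_distribution()
for value, prob in dist.items():
    print(value, prob)
```

A sum of two rolls is triangular rather than flat, which is why the commonly suggested 5*rand5() + rand5() - 5 treats the two rolls as base-5 digits instead of adding them.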

score -2 · Answer 72 · answered May 17 '19 at 10:49

// returns random number between 0-5 with equal probability
function rand5() {
  return Math.floor(Math.random() * 6);
}

// returns random number between 0-7 with equal probability
function rand7() {
  if(rand5() % 2 == 0 && rand5() % 2 == 0) { 
    return 6 + rand5() % 2;
  } else {
    return rand5();
  }
}

console.log(rand7());

Patrick Hogan · Answer 73 · 2009-04-30T21:12:53.760

-2

A constant-time solution that produces an approximately uniform distribution. ~~The trick is 625 happens to be cleanly divisible by 7 and you can get uniform distributions as you build up to that range.~~

Edit: My bad, I miscalculated, but instead of pulling it I'll leave it in case someone finds it useful/entertaining. It does actually work after all... :)

int rand5()
{
    return (rand() % 5) + 1;
}

int rand25()
{ 
    return (5 * (rand5() - 1) + rand5());
}

int rand625()
{
    return (25 * (rand25() - 1) + rand25());
}

int rand7()
{
    return ((625 * (rand625() - 1) + rand625()) - 1) % 7 + 1;
}

edited Apr 30 '09 at 21:12

answered Apr 30 '09 at 15:14

Patrick Hogan

"625 happens to be cleanly divisible by 7" - guess again. 625 = 5^4 is not divisible by 7. – Steve Jessop Apr 30 '09 at 16:32
Thanks, you are quite correct. Apple's calculator lied to me (or rather I forgot it doesn't have decimals in "programmer" mode). – Patrick Hogan Apr 30 '09 at 21:16
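
The size of the remaining bias can be worked out directly. The sketch below assumes an ideal rand5(), so that the combined value in rand7() is uniform over 1 to 5**8 = 390625, and counts how many of those values land on each output of `(value - 1) % 7 + 1`. The names are illustrative only.

```python
TOTAL_OUTCOMES = 5 ** 8  # size of the range built from two rand625() calls

def rand7_outcome_counts():
    """Number of equally likely combined values that map to each output 1..7."""
    base, extra = divmod(TOTAL_OUTCOMES, 7)
    # Values 1..TOTAL_OUTCOMES cycle through outputs 1..7 in order,
    # so the first `extra` outputs receive one additional value each.
    return {out: base + (1 if out <= extra else 0) for out in range(1, 8)}

counts = rand7_outcome_counts()
print(counts)
```

Under these assumptions, outputs 1 through 4 each get 55804 values and outputs 5 through 7 each get 55803, so the distribution is very nearly, but not exactly, uniform.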

score -3 · Answer 74 · answered Apr 30 '09 at 17:43

-3

int rand7()
{
    int zero_one_or_two = ( rand5() + rand5() - 1 ) % 3 ;
    return rand5() + zero_one_or_two ;
}

answered Apr 30 '09 at 17:43

mangokun


score -3 · Answer 75 · answered Apr 30 '09 at 17:59

-3

#!/usr/bin/env ruby
class Integer
  def rand7
    rand(7)+1   # Kernel#rand(7) yields 0..6, so this is 1..7
  end
end

def rand5
  rand(5)+1     # Kernel#rand(5) yields 0..4, so this is 1..5
end

x = rand5() # x => int between 1 and 5

y = x.rand7() # y => int between 1 and 7

..although that may possibly be considered cheating..

answered Apr 30 '09 at 17:59

dbr


score -4 · Answer 76 · edited Apr 30 '09 at 18:01

-4

Solution in PHP

<?php
function random_5(){
    return rand(1,5);
}


function random_7(){
    $total = 0;

    for($i=0;$i<7;$i++){
        $total += random_5();
    }

    return ($total%7)+1; 
}

echo random_7();
?>

edited Apr 30 '09 at 18:01

dbr


answered Nov 27 '08 at 20:21

score -4 · Answer 77 · answered Aug 24 '18 at 01:43

-4

This is the answer I came up with, but these complicated answers are making me think this is completely off. :))

import random

def rand5():
    return float(random.randint(0,5))

def rand7():
    random_val = rand5()
    return float(random.randint((random_val-random_val),7))

print rand7()

answered Aug 24 '18 at 01:43

Hassan

rand7 should use ONLY rand5. (cannot use random.randint) – Seb Aug 24 '18 at 08:26
`random_val-random_val` is just a fancy way to spell `0`, so in the second line of `rand7` you're simply doing `return float(random.randint(0, 7))`. Not only is that cheating as @Sebastian pointed out, it also isn't a solution to the problem, because it's returning integers in the range 0-7, not integers in the range 1-7 as specified. – Mark Dickinson Aug 24 '18 at 18:08

score -4 · Answer 78 · answered May 03 '09 at 10:22

I have played around and written a "testing environment" for this Rand(7) algorithm. For example, if you want to see what distribution your algorithm gives, or how many iterations it takes to generate all distinct random values (for Rand(7), 1-7), you can use it.

My core algorithm is this:

return (Rand5() + Rand5()) % 7 + 1;

Well, it is no less uniformly distributed than Adam Rosenfield's one (which I included in my code snippet).

private static int Rand7WithRand5()
{
    //PUT YOUR FAVOURITE ALGORITHM HERE//

    //1. Stackoverflow winner
    int i;
    do
    {
        i = 5 * (Rand5() - 1) + Rand5(); // i is now uniformly random between 1 and 25
    } while (i > 21);
    // i is now uniformly random between 1 and 21
    return i % 7 + 1;

    //My 2 cents
    //return (Rand5() + Rand5()) % 7 + 1;
}

This "testing environment" can take any Rand(n) algorithm and test and evaluate it (distribution and speed). Just put your code into the "Rand7WithRand5" method and run the snippet.

A few observations:

Adam Rosenfield's algorithm is no better distributed than, for example, mine. Anyway, both algorithms' distribution is horrible.
Native Rand7 (random.Next(1, 8)) generates all members of the given interval in around 200+ iterations, while the Rand7WithRand5 algorithms take on the order of 10k (around 30-70k).
The real challenge is not to write a method that generates Rand(7) from Rand(5), but to generate values that are more or less uniformly distributed.

No, your algorithm does not produce a uniform distribution. It produces 1..7 with probabilities 4/25, 3/25, 3/25, 3/25, 3/25, 4/25, 5/25, as can easily be verified by counting all 25 possible outcomes. 25 is not divisible by 7. Your test for uniformity is also flawed -- the number of trials needed to get every number has a complicated distribution, see http://is.gd/wntB . You need to perform your test thousands of times, not once. A better test would be to call the RNG thousands of times and compare the number of occurrences of each outcome. — Adam Rosenfield, May 03 '09 at 15:53
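
The counting argument in the comment above can be reproduced in a few lines. This Python sketch (the answer itself is C#) assumes an ideal Rand5() and enumerates all 25 equally likely pairs of rolls for both approaches in the snippet: the two-roll sum `(a + b) % 7 + 1`, and the rejection loop that keeps `5 * (a - 1) + b` only when it is at most 21. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

PAIRS = list(product(range(1, 6), repeat=2))  # all 25 outcomes of two rolls

def two_roll_mod7_counts():
    """Outcomes per result for (Rand5() + Rand5()) % 7 + 1."""
    counts = Counter((a + b) % 7 + 1 for a, b in PAIRS)
    return [counts[v] for v in range(1, 8)]

def rejection_counts():
    """Outcomes per result for the rejection loop, counting accepted pairs only."""
    counts = Counter()
    for a, b in PAIRS:
        i = 5 * (a - 1) + b       # uniform over 1..25
        if i <= 21:               # pairs giving 22..25 are rejected and retried
            counts[i % 7 + 1] += 1
    return [counts[v] for v in range(1, 8)]

print(two_roll_mod7_counts())  # [4, 3, 3, 3, 3, 4, 5]
print(rejection_counts())      # [3, 3, 3, 3, 3, 3, 3]
```

The first list matches the 4/25, 3/25, 3/25, 3/25, 3/25, 4/25, 5/25 probabilities quoted in the comment, while the rejection version gives every output exactly 3 of the 21 accepted pairs.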
