
I'm using rand() to generate two ints, each between 0 and 2. It appears that the combination where the first int is 0 and the second is 2 never occurs. Here is my test code:

#include <cstdio>   // printf
#include <cstdlib>  // rand, srand
#include <ctime>    // time

int main()
{
    srand(time(NULL));
    int number1, number2;
    number1 = rand() % 3;
    number2 = rand() % 3;

    printf("%i, %i\n", number1, number2);
    return 0;
}

Output (25 tries):

2, 2
2, 2
2, 1
1, 2
0, 1
2, 2
1, 2
1, 0
2, 1
1, 0
0, 0
1, 2
2, 2
0, 0
2, 1
1, 0
2, 2
1, 0
2, 1
1, 0
0, 1
1, 2
1, 0
0, 0
2, 2

As you can see, out of 25 tries, the combo 0, 2 never appeared; neither did 2, 0. Is this a sign that I should probably move over to <random>?

4 Answers


No. The probability that one particular pair never appears in 25 draws is (1 - 1/9)^25 ≈ 0.053, so the expected number of missing pairs is 9 * (1 - 1/9)^25 ≈ 0.4736. That is, some two-digit sequence with digits in {0, 1, 2} will be missing from your first 25 results in roughly 40-50% of runs (the expected count slightly overstates the probability, since more than one pair can be missing at once).

Run it again and see what happens.

Michael Graczyk
  • Running the above code in an online compiler gives the combos, but I never get them in my compiler. – CitricThunder Oct 13 '14 at 01:08
  • Q: What is your compiler? Q: Define "never"? 25 times? 50 times? 1000 times? – FoggyDay Oct 13 '14 at 01:15
  • I'm using Visual Studio 2013. I've run it about 150 times without a for loop, since the for loop results in the combos. But without a for loop, I haven't seen them in 150 tries. – CitricThunder Oct 13 '14 at 01:17
  1. You should definitely use <random>. The sooner you forget about rand's existence, the happier you will be.

  2. rand is sometimes implemented with a linear congruential generator (LCG). LCGs suffer from a number of defects, the most relevant being that the low-order bits of sequentially generated numbers are highly correlated. For this reason, you should never use rand() % k to generate numbers in the range [0, k). There are other problems too, such as the bias introduced when k does not evenly divide the generator's range. In fact, generating unbiased random integers from a restricted range involves some subtleties, which <random> handles for you.

  3. srand(time(NULL)) seeds the random number generator with the current time in seconds since the epoch, which means that if you run the program several times in sequence, the seeds will be either the same or similar. If the seeds are the same, the random number sequence will also be the same. If the seeds are similar, the random number sequences may also be similar. So don't do this except in long-running programs. Finding a good seed for a pseudo-random number generator can be tricky; with <random> you can seed from std::random_device, so you won't normally need to worry about this.

rici
  • +1. Also, Eric Lippert has a [good post](http://ericlippert.com/2013/12/16/how-much-bias-is-introduced-by-the-remainder-technique/) on why the remainder technique introduces bias. – Ilian Oct 13 '14 at 01:44
  • @IlianPinzon: Or you could watch stl's preso on channel9: http://channel9.msdn.com/Events/GoingNative/2013/rand-Considered-Harmful – rici Oct 13 '14 at 01:46

It is random. This code gave me the 0, 2 pair on the very first run:

for( int i = 0; i < 20; ++i) {
  number1 = rand() % 3;
  number2 = rand() % 3;
  printf("%i, %i\n", number1, number2);
}

Generating truly random numbers from a uniform distribution doesn't guarantee that any given (possible) value will appear within a limited number of trials. K2, one of the four BSI criteria for a good PRNG, is:

K2 — A sequence of numbers which is indistinguishable from 'true random' numbers according to specified statistical tests.

so generating pseudo-random numbers of course tends to behave the same way as sampling from a truly random distribution, although, because of the generator's finite period, every possible value is guaranteed to appear at some point (within at most one full period).

http://ideone.com/c5oRQL


Use std::uniform_int_distribution

Apart from the above, rand() is not the best generator, and the % operator introduces bias whenever the divisor does not evenly divide the range of the PRNG. RAND_MAX + 1, the number of values rand() can return, is generally not a multiple of 3, so the distribution produced by rand() % 3 is skewed, and the bias grows with the divisor. You can read more on this here. Summing it up: in C++ you should use the <random> library:

#include <iostream>
#include <random>

int main()
{
    std::random_device rd;
    std::mt19937 gen( rd());
    std::uniform_int_distribution<> dis( 0, 2);

    for ( int n = 0; n < 25; ++n)
        std::cout << dis(gen) << ' ';
    std::cout << '\n';
}
4pie0

The result of % 3 depends on all the bits of the value, not just the low-order bits.

I ran the program below using VC++ simulating running the OP's program ten million times with one second between invocations. It shows no bias.

start = 1413167398
(0, 0) 1110545
(0, 1) 1111285
(0, 2) 1111611
(1, 0) 1111317
(1, 1) 1111666
(1, 2) 1110451
(2, 0) 1111580
(2, 1) 1110491
(2, 2) 1111054

#include <cstdlib>
#include <ctime>
#include <iostream>
#include <map>
#include <utility>

int main()
{
    std::map<std::pair<int, int>, int> counter;

    unsigned int start = static_cast<unsigned int>(std::time(nullptr));
    std::cout << "start = " << start << std::endl;
    unsigned int finish = start + 10000000;
    for (unsigned int seed = start; seed != finish; ++seed)
    {
        std::srand(seed);
        int x = rand() % 3;
        int y = rand() % 3;

        ++counter[std::make_pair(x, y)];
    }

    for (auto iter = counter.cbegin(); iter != counter.cend(); ++iter)
    {
        std::cout << "(" << iter->first.first << ", " << iter->first.second << ") ";
        std::cout << iter->second << std::endl;
    }

    return 0;
}
user515430