2

I have a simple C++ program I was using to demo random selection of an array element with my students.

As usual, I seeded using srand (time (0)); before using rand() to generate a number.

The array has seven entries, and the selected entry was the first - subscript 0.

I have now run versions of the program fifteen times over forty minutes and the first randomly-generated number has been a multiple of seven, for example:

1258771276 1258586399 1229409447 1257140997 1256216612 1260855344 1262973026 1266351233

I changed up the program to select five random numbers after seeding, and (after generating an initial multiple of 7), the succeeding numbers behave better.

I understand that "random" means I can't predict the numbers to be different any more than I can predict them to be the same.

I also understand that these are pseudorandom numbers.

I also understand rand() is not expected to be top-grade cryptographic randomization.

Still, this behavior is so odd I am completely flabbergasted. Have I fallen into some odd crack of the C++ runtime? This is using Xcode on MacOS fully patched to Mojave 10.14.1.

Andrew Wolfe
  • 2,020
  • 19
  • 25
  • 3
    Linear congruential generators like the (usual) implementation of rand() often display artefacts like this - it's why you should not be using them for any serious applications. Another artefact appears to be that the first two digits of the generated numbers are always 12. If you _must_ use rand() then taking the middle digits can often give you "better" values. –  Nov 12 '18 at 17:23
  • Perhaps this https://stackoverflow.com/questions/13445688/how-to-generate-a-random-number-in-c will be of some help. – Michael Surette Nov 12 '18 at 17:35
  • I agree with Neil, and if I can add, the C++ rand() function is incredibly convenient (and probably the easiest to demonstrate as a teacher), but has been superseded by more complete library functions, http://www.cplusplus.com/reference/random/ – Phill Nov 12 '18 at 17:36
  • There are some (I think) interesting comments about this (for Unity/C#, but essentially the same) in [this blog post](https://blogs.unity3d.com/2015/01/07/a-primer-on-repeatable-random-numbers/), under "Can’t I just use random number generators with different seed values?". – jdehesa Nov 12 '18 at 17:36
  • [rand() Considered Harmful](https://channel9.msdn.com/Events/GoingNative/2013/rand-Considered-Harmful) is worth watching. – Jesper Juhl Nov 12 '18 at 18:18
  • Ideas for an easy to use replacement for simple situations: https://stackoverflow.com/a/50665027/3807729 – Galik Nov 12 '18 at 18:19
  • It would more useful perhaps to save the value of `time(0)` in a variable, and then supply this value as the seed. Then you can print the seed and the rand output and we can do something other than guess what the issue might be. – President James K. Polk Nov 12 '18 at 18:27
  • "rand() Considered Harmful" video is normally considered a poor joke in professional circles. The authors of that video tried to earn some "likes" on cheap hype, but ended up failing rather miserably. `rand()` is NOT considered harmful. – AnT stands with Russia Nov 12 '18 at 18:57
  • 1
    It would be rather strange to see a pseudo-random generator that always generates a multiple of 7 after seeding, but displays no such anomaly for further pseudo-random values in the sequence. After all, a typical `rand` implementation usually does not discern a "freshly seeded" state from state "inherited from previous `rand()` call". If you observe something like that I would suspect some anomaly in the seed value, not in the generator implementation. However, it is still possible that a bug in `rand()` implementation is causing this. – AnT stands with Russia Nov 12 '18 at 19:00
  • 1
    @AnT: unfortunately, this is the case for this generator. The first number it generates are almost always have the same value (mod 16807). And 16807 is evenly divisible by 7. That's why OP experiences this. – geza Nov 12 '18 at 19:12
  • @geza: Sorry, but this makes no sense. Again, this generator cannot discern "first number" from "non-first number". Taking this into account, how can you possibly make any statements about "first number it generates"??? – AnT stands with Russia Nov 12 '18 at 19:18
  • @AnT: What it means, that if you seed the generator with similar numbers, then the first output will be the same (mod 16807) with a high chance. But yes, of course, it doesn't know anything about the first number. But, if the internal state is similar (`ctx`), then the next number will likely be the same (mod 16807). Even the second number has a high chance that subtracting it from the first result will give the same value (mod 16807). Just take a little time to analyze it. – geza Nov 12 '18 at 19:22
  • @geza: But the OP's claim is very different. The OP states that the first number they see is always 0 (mod 7). That is a very different claim. – AnT stands with Russia Nov 12 '18 at 19:27
  • It's because of the sh\*tty `rand` implementation in macOS. Duplicates: [Why does rand() % 7 always return 0?](https://stackoverflow.com/q/7866754/995714), [rand function is giving me the same result at each run even when I called srand(time(NULL))](https://stackoverflow.com/q/66808250/995714), [Rand() % 14 only generates the values 6 or 13](https://stackoverflow.com/q/20263187/995714) – phuclv Aug 27 '21 at 04:49
  • Does this answer your question? [Why does rand() % 7 always return 0?](https://stackoverflow.com/questions/7866754/why-does-rand-7-always-return-0) – phuclv Aug 27 '21 at 04:49

1 Answers1

1

Here's the source code of the random generator in question:

static int
do_rand(unsigned long *ctx)
{
/*
 * Compute x = (7^5 * x) mod (2^31 - 1)
 * without overflowing 31 bits:
 *      (2^31 - 1) = 127773 * (7^5) + 2836
 * From "Random number generators: good ones are hard to find",
 * Park and Miller, Communications of the ACM, vol. 31, no. 10,
 * October 1988, p. 1195.
 */
    long hi, lo, x;

    /* Can't be initialized with 0, so use another value. */
    if (*ctx == 0)
        *ctx = 123459876;
    hi = *ctx / 127773;
    lo = *ctx % 127773;
    x = 16807 * lo - 2836 * hi;
    if (x < 0)
        x += 0x7fffffff;
    return ((*ctx = x) % ((unsigned long)0x7fffffff + 1));
}

This is not a good quality generator considering the current state-of-the-art generators. However, it is not that bad either.

But, its seeding mechanism is the worst possible: it just sets ctx as the seed. So, if you set the seed to similar values (just like your example, you seed it with time, it means it will have similar values), the next number it generates will have a really strong correlation to the seed.

One solution to the problem is to generate "some" numbers after seeding. It won't solve the problem perfectly, however (it is usual that the seed has a strong correlation to the generated number, even at very long distances).

Or, the better solution is to forget about rand(), and use the more modern (and a little bit harder to use) random, which is in #include <random>. Check out cppreference's documentation about it.

geza
  • 28,403
  • 6
  • 61
  • 135
  • A quick experiment shows that this RNG produces first result divisible by 7 for about 14.3% of seed values, as expected. This is perfectly normal and not even close to the OP's claim. – AnT stands with Russia Nov 12 '18 at 19:15
  • 1
    @AnT: what is not normal, that if the seeds values are close to each other, then the first result will be the same (mod 16807). – geza Nov 12 '18 at 19:18
  • First result being the same (mod 16807) is not the same thing as first result being divisible by 7. – AnT stands with Russia Nov 12 '18 at 19:25
  • 1
    @AnT: At the time OP tried this, it meant the same. This generator gives the same value for 127773 seeds next to each other (mod 16807). It means 1.5 days (if we count this in seconds). It means, that for 1.5 days, seeding with `time()`, it will give the same first number (mod 16807). Or the same number (mod 7), as 16807 is evenly divisible by 7. For the OP, this same number happened be zero. But if OP tries this tomorrow, OP may get 2. For 1.5 days. – geza Nov 12 '18 at 19:30
  • Apparently you are right. But that's just another reason to be more careful with the seeding technique as opposed to just using the popular `srand(time(0))` (assuming the application requires anything more careful). In most applications it wouldn't matter though. Frankly, I'm having hard time coming up with an application that would be satisfied with an LCG generator on the on hand, and yet concerned about such seeding behavior on the other hand. – AnT stands with Russia Nov 12 '18 at 19:46