Randomized array of numbers won't go past the integer range?

Question

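#include <cstdlib>
#include <ctime>
#include <iostream>
using namespace std;

long long x[1000000], n, small, big;

int main() {
    cin >> n; // number of values to generate
    cin >> small; cin >> big; // inclusive bounds
    srand(time(NULL));
    for (int i = 0; i < n; i++) {
        x[i] = small + rand() % (big - small + 1);
    }
    merge_sort(x, 0, n - 1); // my own sort, not shown
    for (int i = 0; i < n; i++) {
        cout << x[i] << " ";
    }
}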

(Ignore the merge_sort; I only used it to see the highest number.)

I made sure all the variables are long long, yet none of the randomized x[i] values go past the integer range (32768). Does anyone have any idea why?

some example

The return type of `rand` is `int`. ( https://en.cppreference.com/w/cpp/numeric/random/rand ) — Justin, Nov 16 '18 at 18:18
See [RAND_MAX](https://en.cppreference.com/w/cpp/numeric/random/RAND_MAX). One of the many *many* reasons `rand` is terrible. Looks like your implementation supports the minimum required range. — François Andrieux, Nov 16 '18 at 18:18
Related, maybe a duplicate, maybe not (depends on whether this question is an XY problem): https://stackoverflow.com/q/13445688/1896169 — Justin, Nov 16 '18 at 18:20
note that instead of suggesting to ignore `merge_sort` you could simply remove the line from your code if it is not relevant for the question. See also [mcve] — 463035818_is_not_an_ai, Nov 16 '18 at 18:21
@FrançoisAndrieux -- all RNGs have a limited range. That doesn't make them terrible. — Pete Becker, Nov 16 '18 at 18:21
Many programmers fall into the fallacy that something that isn't perfect is useless. Especially in the area of random number generators, no generator is perfect, but only `rand` gets labelled useless. That's propaganda, not engineering. — Pete Becker, Nov 16 '18 at 19:54
I've removed my side of this conversation, as it's not productive and pollutes this comments section. Feel free to defend `rand` in any of the many questions that ask about why it's considered bad. Here's [one](https://stackoverflow.com/questions/52869166/why-is-the-use-of-rand-considered-bad) . — François Andrieux, Nov 16 '18 at 19:55

score 1 · Answer 1 · answered Nov 17 '18 at 14:33

Like all random-number generators, std::rand() produces numbers in a limited range. All the values are greater than or equal to 0, and all the values are less than or equal to RAND_MAX. The value of RAND_MAX depends on the implementation that you're using. You can look at it to see the upper limit:

std::cout << RAND_MAX << '\n';

There are techniques for creating larger or smaller ranges from the results of a random-number generator. With the newer <random> classes you'd use std::uniform_int_distribution to handle the details. With std::rand() you have to roll your own code.

To reduce the range, most people's first instinct is to use the remainder, as the code in the question does. That's okay, sort of, but it can introduce distortions. As an extreme example, suppose RAND_MAX is 32767, and you want to generate values in the range [0 .. 32766]. So you use rand() % 32767, which gives you values in that range. But there's a problem: whenever rand() produces a 0 you get a 0, when it produces a 1 you get a 1, etc., up to 32766. But when rand() produces 32767 you get 0. That is, you'll get 0 when rand() produces 0 and when it produces 32767, but you'll get other values for only one particular result from rand(). If rand() is perfectly uniform (no RNG is), you'll get 0 twice as often as you get any other value. When the range you want to generate is much smaller than the total range that the RNG produces this distortion isn't as large, and might be acceptable.
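To make that distortion concrete, here is a small sketch that counts how many generator outputs land on each remainder. The function name remainder_counts is made up for illustration, and it pretends the generator emits every value in [0 .. generator_max] exactly once, which is the idealized uniform case described above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts how many of the generator outputs
// 0..generator_max map to each remainder under % modulus.
// Assumes generator_max >= 0 and modulus >= 1 (not checked).
std::vector<int> remainder_counts(int generator_max, int modulus) {
    std::vector<int> counts(static_cast<std::size_t>(modulus), 0);
    // long long loop variable so r <= generator_max cannot overflow
    for (long long r = 0; r <= generator_max; ++r) {
        ++counts[static_cast<std::size_t>(r % modulus)];
    }
    return counts;
}
```

With generator_max = 32767 and modulus = 32767, the count for 0 comes out as 2 and every other count as 1. With modulus = 100 the counts are 328 for remainders 0 to 67 and 327 for the rest, so the skew is much smaller.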

To do it right you have to discard values. The simplest way to discard values is just to ignore values larger than what you want:

int res = rand();
while (res > my_max_value)
    res = rand();

But if my_max_value is small, this will throw away a lot of values, wasting a lot of time. So a better approach is to combine discarding with taking the remainder:

int max = ((RAND_MAX + 1) / my_max_value) * my_max_value;
int res = rand();
while (res >= max)
    res = rand();
res %= my_max_value;

I haven't done that calculation exactly right; for example, if RAND_MAX is equal to INT_MAX, adding 1 will overflow. Fixing that is left as an exercise for the reader.

max will be the largest multiple of my_max_value that's less than or equal to RAND_MAX + 1. The code discards values that are greater than or equal to that, and when it finds an appropriate value it takes the remainder, which gives a result in the range [0 .. my_max_value - 1].
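One possible way to sidestep the RAND_MAX + 1 overflow is to do the arithmetic in unsigned. This is only a sketch: bounded_rand and count are names I made up, it assumes 1 <= count <= RAND_MAX + 1, and it assumes unsigned can hold RAND_MAX + 1, which is true on the usual platforms but is not checked here.

```cpp
#include <cstdlib>

// Hypothetical helper: returns a value in [0, count - 1] from std::rand().
// Assumes 1 <= count <= RAND_MAX + 1 and that unsigned can represent
// RAND_MAX + 1 (neither is checked).
unsigned bounded_rand(unsigned count) {
    // Number of distinct results std::rand() can produce.
    const unsigned range = static_cast<unsigned>(RAND_MAX) + 1u;
    // Largest multiple of count that is <= range.
    const unsigned limit = range - range % count;
    unsigned r = static_cast<unsigned>(std::rand());
    while (r >= limit) {  // discard results from the incomplete last block
        r = static_cast<unsigned>(std::rand());
    }
    return r % count;
}
```

For example, after seeding with std::srand, bounded_rand(6) gives a value from 0 to 5, and at most count - 1 of the possible rand() results are ever discarded.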

To create a range that's larger than the range of the RNG you "tile" multiple values. For example, if RAND_MAX is 32767 and you need values in the range [0 .. 1073741823] (that upper limit is (32767 << 15) + 32767) you'd call rand() twice, shift one of the results left by 15, and add the other result. If you don't have such a convenient <g> value you have to create a range that's larger than your target range by tiling, then reduce the resulting range as I discussed earlier.
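The tiling step for that convenient case might look like the sketch below. The names combine15 and rand30 are hypothetical. Masking with 0x7FFF keeps the low 15 bits of each call, which assumes RAND_MAX has the form 2^k - 1 with k >= 15; for other values of RAND_MAX the mask would itself introduce a skew.

```cpp
#include <cstdlib>

// Hypothetical helper: joins two 15-bit values into one 30-bit value.
// hi supplies the upper 15 bits, lo the lower 15 bits.
long long combine15(long long hi, long long lo) {
    return (hi << 15) + lo;
}

// Hypothetical helper: draws a value in [0, 1073741823] from two
// std::rand() calls. Assumes RAND_MAX is 2^k - 1 with k >= 15, so the
// low 15 bits of each call cover [0, 32767] evenly.
long long rand30() {
    const long long hi = std::rand() & 0x7FFF;
    const long long lo = std::rand() & 0x7FFF;
    return combine15(hi, lo);
}
```

combine15(32767, 32767) is 1073741823, the upper limit quoted above, and combine15(1, 0) is 32768, the first value that a single 15-bit rand() call cannot reach.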

Note that all this stuff has to be done regardless of the RNG that you're using. It's much more convenient in C++11, where all those calculations are encapsulated in std::uniform_int_distribution, and that's a good reason for using the C++11 RNGs and distributions.
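For the array in the question, a C++11 version could look roughly like this. It is a sketch rather than a drop-in replacement: random_values is a made-up name, the choice of std::mt19937_64 seeded from std::random_device is mine, and it assumes small <= big, which std::uniform_int_distribution requires.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: returns n values drawn uniformly from
// [small, big], both bounds inclusive. Assumes small <= big.
std::vector<long long> random_values(std::size_t n, long long small, long long big) {
    std::random_device seed_source;         // seed source provided by the platform
    std::mt19937_64 engine(seed_source());  // 64-bit Mersenne Twister engine
    // The distribution does the range scaling and rejection internally.
    std::uniform_int_distribution<long long> dist(small, big);
    std::vector<long long> values(n);
    for (long long& v : values) {
        v = dist(engine);
    }
    return values;
}
```

Calling random_values(n, small, big) with bounds such as 0 and 5000000000 can then produce values well past 32767, because the result no longer depends on RAND_MAX.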
