4

I have the following function:

typedef unsigned long long int UINT64;
UINT64 getRandom(const UINT64 &begin = 0, const UINT64 &end = 100) {
    return begin >= end ? 0 : begin + (UINT64) ((end - begin)*rand()/(double)RAND_MAX);
};

Whenever I call

getRandom(0, ULLONG_MAX);

or

getRandom(0, LLONG_MAX);

I always get the same value 562967133814800. How can I fix this problem?

  • Your `rand()` function is insufficiently precise for your use case. – Richard Apr 05 '14 at 16:56
  • Given that at least one widespread implementation of the C standard library has 32767 as `RAND_MAX`, I would avoid to spread `rand()` on a 64-bit space... – Matteo Italia Apr 05 '14 at 17:03
  • 1
    In your case, 2x uint32 is as good as a single uint64, you can just generate the random value in two passes as two 32 bit random integers and read as a single 64 bit. Provided you use unsigned as you are supposed to and correct alignment. You can even save yourself the bit shifting. – dtech Apr 05 '14 at 17:15
  • Naturally, feel free to compose as many as your RAND_MAX mandates. In case it is 16bit wide, just use 4 instead. – dtech Apr 05 '14 at 17:28
  • possible duplicate of [Generate random numbers uniformly over an entire range](http://stackoverflow.com/questions/288739/generate-random-numbers-uniformly-over-an-entire-range). See [here](http://stackoverflow.com/a/20136256/493122). – Shoe Apr 05 '14 at 17:28
  • It turns out to be computation overflow. And my post is not about numbers distributation :P – Nguyễn Đức Long Apr 05 '14 at 17:36

8 Answers8

28

What is rand()?

According to this the rand() function returns a value in the range [0,RAND_MAX].

What is RAND_MAX?

According to this, RAND_MAX is "an integral constant expression whose value is the maximum value returned by the rand function. This value is library-dependent, but is guaranteed to be at least 32767 on any standard library implementation."

Precision Is An Issue

You take rand()/(double)RAND_MAX, but you have perhaps only 32767 discrete values to work with. Thus, although you have big numbers, you don't really have more numbers. That could be an issue.

Seeding May Be An Issue

Also, you don't talk about how you are calling the function. Do you run the program once with LLONG_MAX and another time with ULLONG_MAX? In that case, the behaviour you are seeing is because you are implicitly using the same random seed each time. Put another way, each time you run the program it will generate the exact same sequence of random numbers.

How can I seed?

You can use the srand() function like so:

#include <stdlib.h>     /* srand, rand */
#include <time.h>       /* time */

int main (){
  srand (time(NULL));
  //The rest of your program goes here
}

Now you will get a new sequence of random numbers each time you run your program.

Overflow Is An Issue

Consider this part ((end - begin)*rand()/(double)RAND_MAX).

What is (end-begin)? It is LLONG_MAX or ULLONG_MAX these are, by definition, the largest possible values those data types can hold. Therefore, it would be bad to multiply them by anything. Yet you do! You multiply them by rand(), which is non-zero. This will cause an overflow. But we can fix that...

Order of Operations Is An Issue

You then divide them by RAND_MAX. I think you've got your order of operations wrong here. You really meant to say:

((end - begin) * (rand()/(double)RAND_MAX) )

Note the new parantheses! (rand()/(double)RAND_MAX)

Now you are multiplying an integer by a fraction, so you are guaranteed not to overflow. But that introduces a new problem...

Promotion Is An Issue

But there's an even deeper problem. You divide an int by a double. When you do that the int is promoted to a double. A double is a floating-point number which basically means that it sacrifices precision in order to have a big range. That's probably what's biting you. As you get to bigger and bigger numbers both your ullong and your llong end up getting cast to the same value. This could be especially true if you overflowed your data type first (see above).

Uh oh

So, basically, everything about the PRNG you have presented is wrong.

Perhaps this is why John von Neumann said

Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin.

And, sometimes, we pay for those sins.

How can I absolve myself?

C++11 provides some nice functionality. You can use it as follows

#include <iostream>
#include <random>
#include <limits>
int main(){
  std::random_device rd;     //Get a random seed from the OS entropy device, or whatever
  std::mt19937_64 eng(rd()); //Use the 64-bit Mersenne Twister 19937 generator
                             //and seed it with entropy.

  //Define the distribution, by default it goes from 0 to MAX(unsigned long long)
  //or what have you.
  std::uniform_int_distribution<unsigned long long> distr;

  //Generate random numbers
  for(int n=0; n<40; n++)
    std::cout << distr(eng) << ' ';
  std::cout << std::endl;
}

(Note that appropriately seeding the generator is difficult. This question addresses that.)

Richard
  • 56,349
  • 34
  • 180
  • 251
4
typedef unsigned long long int UINT64;

UINT64 getRandom(UINT64 const& min = 0, UINT64 const& max = 0)
{
    return (((UINT64)(unsigned int)rand() << 32) + (UINT64)(unsigned int)rand()) % (max - min) + min;
}

Using shift operation is unsafe since unsigned long long might be less than 64 bits on some machines. You can use unsigned __int64 instead, but keep in mind it's compiler dependant, therefore is available only in certain compilers.

unsigned __int64 getRandom(unsigned __int64 const& min, unsigned __int64 const& max)
{
    return (((unsigned __int64)(unsigned int)rand() << 32) + (unsigned __int64)(unsigned int)rand()) % (max - min) + min;
}
Ivars
  • 2,375
  • 7
  • 22
  • 31
  • Provided RAND_MAX is 32 bit wide. – dtech Apr 05 '14 at 17:30
  • @ddriver provided int is 32 bit wide and long long 64 bit wide :) – Ivars Apr 05 '14 at 17:33
  • That does not address ddriver's comment, @Ivars. The range of `RAND_MAX` is implementation-dependent and may be only 16 bits wide, in which case your function will not generate numbers with the expected distribution. **Bit-bashing random numbers is dangerous.** Since you do not know who may end up using this code, you should not post something so dangerously wrong. – Richard Apr 05 '14 at 17:37
  • @Richard ok i am ashamed) what would you say if i return `(((unsigned long long)rand() << 48) + ((unsigned long long)rand() << 32) + ((unsigned long long)rand() << 16) + (unsigned __int64)(unsigned int)rand()) % (max - min) + min` – Ivars Apr 05 '14 at 17:46
  • @Richard - I think the width of RAND_MAX depends directly on the width of int. So.. technically it does apply, because if rand max is 16 bit so would be int. – dtech Apr 05 '14 at 18:10
  • @Ivars, I'd say it looks better (though the `(unsigned __int64)(unsigned int)rand()` is rather mysterious), but that the modulus operator introduces a bias in your output (see [this](http://stackoverflow.com/q/10984974/752843)). Since this is a C++ question and we are rapidly moving toward C++11 and, since C++11 has good ways of handling these things, there's really no reason to use such an esoteric method. – Richard Apr 05 '14 at 19:05
  • @Richard there is no mystery.) since `long long` is wide enough to store `long` sign. first addend of this expression will have a sign without casting to `unsigned int`. that would be bad. – Ivars Apr 05 '14 at 19:13
  • @ddriver, every source I can find merely says that the value of `RAND_MAX` is guaranteed to be at least 32767, but that the actual value is _implementation dependent_. That is not the same as being equal to `signed_int_max`, as you suggest. Thinking something is true does not make it so: show me a good source if you want me to believe. Additionally, the length of `int` may vary from machine to machine, so Ivar should not have hard-coded his bit shifts if he intended to present general code. – Richard Apr 05 '14 at 19:13
1

Use your own PRNG that meets your requirements rather than the one provided with rand that seems not to and was never guaranteed to.

David Schwartz
  • 179,497
  • 17
  • 214
  • 278
1

Given that ULLONG_MAX and LLONG_MAX are both way bigger than the RAND_MAX value, you will certainly get "less precision than you want".

Other than that, there's 50% chance that your value is below the LLONG_MAX, as it is halfway throuogh the range of 64-bit values.

I would suggest using the Mersenne-Twister from the C++11, which has a 64-bit variant http://www.cplusplus.com/reference/random/mt19937_64/

That should give you a value that fits in a 64-bit number.

If you "always get the same value", then it's because you haven't seeded the random number generator, using for example srand(time(0)) - you should normally only seed once, because this sets the "sequence". If the seed is very similar, e.g. you run the same program twice in short succession, you will still get the same sequence, because "time" only ticks once a second, and even then, doesn't change that much. There are various other ways to seed a random number, but for most purposes, time(0) is reasonably good.

Mats Petersson
  • 126,704
  • 14
  • 140
  • 227
  • It is true that `ULLONG_MAX` and `LLONG_MAX` are bigger than `RAND_MAX`. I always get the same result whenever I call the function at that range. Not talking about precision, why would I get the same result that is smaller than the range I want? – Nguyễn Đức Long Apr 05 '14 at 17:19
  • I seed it. Moreover, I call them in a loop. But now, I find out the number overflow while multiplying. – Nguyễn Đức Long Apr 05 '14 at 17:33
1

You are overflowing the computation, in the expression

((end - begin)*rand()/(double)RAND_MAX)

you are telling the compiler to multiply (ULLONG_MAX - 0) * rand() and then divide by RAND_MAX, you should divide by RAND_MAX first, then multiply by rand().

// http://stackoverflow.com/questions/22883840/c-get-random-number-from-0-to-max-long-long-integer
#include <iostream>
#include <stdlib.h>     /* srand, rand */
#include <limits.h>
using std::cout;
using std::endl;

typedef unsigned long long int UINT64;
UINT64 getRandom(const UINT64 &begin = 0, const UINT64 &end = 100) {
    //return begin >= end ? 0 : begin + (UINT64) ((end - begin)*rand()/(double)RAND_MAX);
    return begin >= end ? 0 : begin + (UINT64) rand()*((end - begin)/RAND_MAX);
};

int main( int argc, char *argv[] )
{
    cout << getRandom(0, ULLONG_MAX) << endl;
    cout << getRandom(0, ULLONG_MAX) << endl;
    cout << getRandom(0, ULLONG_MAX) << endl;
    return 0;
}

See it live in Coliru

amdn
  • 11,314
  • 33
  • 45
1
union bigRand {
    uint64_t ll;
    uint32_t ii[2];   
};

uint64_t rand64() {
    bigRand b;
    b.ii[0] =  rand();
    b.ii[1] =  rand();
    return b.ll;
}

I am not sure how portable it is. But you could easily modify it depending on how wide RAND_MAX is on the particular platform. As an upside, it is brutally simple. I mean the compiler will likely optimize it to be quite efficient, without extra arithmetic whatsoever. Just the cost of calling rand twice.

dtech
  • 47,916
  • 17
  • 112
  • 190
0

The most reasonable solution would be to use C++11's <random>, mt19937_64 would do.

Alternativelly you might try:

return ((double)rand() / ((double)RAND_MAX + 1.0)) * (end - begin + 1) + begin;

to produce numbers in more reasonable way. However note that just like your first attempt, this will still not be producing uniformly distributed numbers (although it might be good enough).

LihO
  • 41,190
  • 11
  • 99
  • 167
0

The term (end - begin)*rand() seems produce an overflow. You can alleviate that problem by using (end - begin) * (rand()/(double)RAND_MAX). Using the second way, I get the following results:

15498727792227194880
7275080918072332288
14445630964995612672
14728618955737210880

with the following calls:

std::cout << getRandom(0, ULLONG_MAX) << std::endl;
std::cout << getRandom(0, ULLONG_MAX) << std::endl;
std::cout << getRandom(0, ULLONG_MAX) << std::endl;
std::cout << getRandom(0, ULLONG_MAX) << std::endl;
R Sahu
  • 204,454
  • 14
  • 159
  • 270