6

I'm trying to understand a C program which includes a .h file with the line

#define random                  ((float) rand() / (float)((1 << 31) - 1))

The C program also includes <math.h>.

My guess is that this simply produces a random number from a uniform distribution on the interval [0,1]; is this correct?

Kurt Peek
  • 52,165
  • 91
  • 301
  • 526
  • Undefined behaviour on all platforms with `int` having no more than 32 bits). Anything can happen, case closed - next please! – too honest for this site Aug 04 '16 at 15:41
  • 1
    Possible duplicate of [Is 1<<31 well defined in C when sizeof(int)==4](https://stackoverflow.com/questions/45268467/is-131-well-defined-in-c-when-sizeofint-4) – Bo Persson Jul 24 '17 at 08:41

3 Answers3

10

Ostensibly yes. But it's wrong in two principal ways:

  1. Use RAND_MAX instead. That's what it's there for. It might be much smaller than 1 << 31 - 1.

  2. 1 << 31 will give you undefined behaviour on a platform with a 32 bit int or less, which is remarkably common. Don't do that!

Note that if you don't want to ever recover the value 1, (as is often the case), then use RAND_MAX + 1.0 on the denominator. The 1.0 forces evaluation in floating point: you run the risk of overflowing an integral type if you write RAND_MAX + 1.

  • It would also invoke UB on platforms with more than 16, but less than 33 bits. And `RAND_MAX + 1` very likely invokes UB on all platforms. – too honest for this site Aug 04 '16 at 15:43
  • If you don't mind I've edited the answer. Both are very good points. Thank you. – Fitzwilliam Bennet-Darcy Aug 04 '16 at 15:43
  • But isn't `RAND_MAX + 1.0` a `double` type? – Fitzwilliam Bennet-Darcy Aug 04 '16 at 15:48
  • Hmm, good point. I don't use floating point in my applications (it is most times a very bad idea on embedded systems). You are right. If you ignore the potential performance issue, you can use it. Still the standard random functions don't guarantee any quality of the algorithm. If a true uniform distribution is required, the algorithm should be choosen carefully and - if really - a floationg point range is required, one should use that directly. – too honest for this site Aug 04 '16 at 15:54
5

The rand function returns a value between 0 and RAND_MAX. This program is assuming that RAND_MAX is 2^31 - 1 and is dividing the result by that number.

So yes, if the above assumption is true, then this macro gives a number from [0,1]. It is not so much a uniform random distribution, but rather a "pseudo-random" value.

At least that's what it's supposed to do. This expression (1 << 31) invokes undefined behavior (assuming int is 32-bit or less) because the the constant 1 has type int and left-shifting by 31 puts it outside the range of int. In practice, if two's complement representation is used, some compilers will allow this shift and then the subsequent -1 will put it back in range, but that can't be depended on.

This undefined behavior can be avoided by using (1U << 31), which makes the constant 1 have type unsigned int so that the shift it within range. A better option is to forget the shift and subtract and just use 0x7fffffff.

But for maximum portability, it should be defined as follows:

#define random ((float)rand() / RAND_MAX)

But there's still a problem. A float typically has a 23 bit mantissa. If rand returns a 32 bit value, you won't get a good distribution of numbers. Better use use double, which has a 52 bit mantissa:

#define random ((double)rand() / RAND_MAX)
dbush
  • 205,898
  • 23
  • 218
  • 273
1

My guess is that this simply produces a random number from a uniform distribution on the interval [0,1]; is this correct?

No. Code may never return 1.0f nor uniform results.

#define random ((float) rand() / (float)((1 << 31) - 1)) has lots of issues. It is weak code.


Lost of precision: Typical float has about 24 bit of precision. Converting the result of rand(), if it exceeds 24 bits, results to a float which may differ in value from the original due to rounding. This weakens/destroys the uniformity of the random number generation. Different rand() results will come up with the same answer. See also @Olaf

A fix for this is problematic as OP apparently wants a uniform random number from the set [0, 1/2,147,483,648, 2/2,147,483,648, ... 2,147,483,647/2,147,483,648] which is not possible given the likely precision limits of float.


The worst, is that (1 << 31) is undefined behavior UB unless int is at least 33 bits long. Shifting a 1 into the sign position is UB. C11dr §6.5.7 4.

To avoid UB, use ((1ul << 31) - 1).

Yet using the magic number ((1ul << 31) - 1) is not as robust as basing the fraction on RAND_MAX.

Further (float) ((1ul << 31) - 1) may suffer from loss of precision as described above as it forms the value 2147483648.0f and not the unobtainable 2147483647.0f. OP's code may never generate a 1.0f.


I suspect OP really needs a [0..1) result and not [0..1]. Both are below.

// generate a `double` in the range [0 ... 1)
#define random0to_almost1  (rand() / (RAND_MAX + 1.0))
// or 
// generate a `double` in the range [0 ... 1]
#define random0to1  (rand() / (RAND_MAX + 0.0))

Note that this suffers like OP's original code should the precision of double (typical 53 bits) is exceeded by RAND_MAX needs.

To cope, one mitigating step is to insure RAND_MAX + 1.0 is done exactly. In the extremely common, but not C specified, RAND_MAX is a power-of-2_minus_1. So RAND_MAX/2 + 1 is an int and an exact power of 2. Conversion of that int to double is certainly exact.

#define random0to_almost1  (rand() / (2.0*(RAND_MAX/2 + 1)))

A float solution would be

// This value is platform dependent, but very common
// Do not a a highly portable generation method yet.
#define FLT_POWER2_INTEGER_LIMIT (1ul << 24)


#define random0to_almost1  ( \
    (rand() % FLT_POWER2_INTEGER_LIMIT) /  \
    (RAND_MAX % FLT_POWER2_INTEGER_LIMIT + 1.0f) \
    )
Community
  • 1
  • 1
chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • 1
    Conversion of a power of 2 `int` value to `double` might not be exact if `FLT_RADIX` is not a power of 2. – Ian Abbott Aug 04 '16 at 18:38
  • @Ian Abbott Yes- agreed. C allows a `FLT_RADIX` of `2,3,4,5...` I have yet to come across a `FLT_RADIX` other than `2,10,16`- of course the future may be different. Certainly values of `2,4,8,16,..` are not an issue with the above answer. With base 10, using the above method still has benefit, but may not reach _exactness_. Should you know of any non-base 2 systems in use today, would be most interested in hearing of them. – chux - Reinstate Monica Aug 04 '16 at 19:44