C++ int to float cast approximation causes RNG failure?

Question

I need to generate a high number of random integers in varying ranges [0, N[ where N is a small positive integer N < 1000. I thought that using a uniform_real_distribution and multiplying by N would do it, but I ran into a problem: Very (very very) rarely, it yields N, which is out of bounds.

#include <random>

std::default_random_engine generator;
std::uniform_real_distribution<float> Udistribution(0.0f, 1.0f);  // [0, 1)

int N = 5;
int R;
R = (int) (Udistribution(generator) * (float)N); 
// R = N sometimes !!!
// Also happens when using floor instead of casting to int.

I thought the reason could be that casting N to a float could increase it just enough so that if the random float is something like .9999999999f, their product could reach over 1. But it happened with N = 5 and N = 10, both being perfectly represented in floats. What could be the reason then ? the float product operation's approximation ?

(I'm not using uniform_int_distribution because N is different each time and I though it would be cheaper this way, yeah I know benchmarking is the way. Just asking this out of curiosity.)

If you are working with integers, you want a [`uniform_int_distribution`](https://en.cppreference.com/w/cpp/numeric/random/uniform_int_distribution). — NathanOliver, Mar 16 '23 at 12:19
"Most existing implementations have a bug where they may occasionally return b" [std::uniform_real_distribution - cppreference.com](https://en.cppreference.com/w/cpp/numeric/random/uniform_real_distribution) — MikeCAT, Mar 16 '23 at 12:22
see the note at https://en.cppreference.com/w/cpp/numeric/random/uniform_real_distribution, if you want integers use an integer distribution, converting from floats is asking for trouble — Alan Birtles, Mar 16 '23 at 12:22
Integer distributions do not have that defect that you describe. — Öö Tiib, Mar 16 '23 at 12:24
*it would be cheaper this way* This way will take longer to run (i.e., slower). What do you mean by "cheaper"? — Eljay, Mar 16 '23 at 12:27
If you think the dynamic `N` is a performance problem, you could generate random integers in `[0, 0xffffffff]` and calculate `((int64) rand * N) >> 32` -- this is not perfectly uniform (unless `N` is a power of two), but cheap to calculate and will never overflow. — chtz, Mar 16 '23 at 12:33
@Eljay I would need to recreate the int distribution after each generation, because N varies. Or at least change the parameters, which maybe require a call to reset() https://stackoverflow.com/questions/28328948/use-stduniform-int-distribution-and-define-its-range-later — Yeb02, Mar 16 '23 at 12:34
You can perfectly change `N`, no need to reset anything [as you can see here](https://godbolt.org/z/ezWcrz4Ex). — Fareanor, Mar 16 '23 at 12:41
@Yeb you may even win in performance as underlying engine is anyway generating integers (conversion from what is causing the very defect) and so you optimize that buggy conversion and your conversion back to int out. — Öö Tiib, Mar 16 '23 at 12:46

C++ int to float cast approximation causes RNG failure?

0 Answers0