5

I generate many many random numbers that need to be between 1 and 15 (included) in C++. Of course, I can generate zillons of std::uniform_int_distribution<std::mt19937::result_type> random(1, 15); but this is a waste since this mersenn twister generates 32 bits (or even 64 using mt19937_64) of random values, and I would only keep 4 bits and throw away all the rest, and in my case, performance is an issue and random number generation is a significant contributor.

My idea was thus to generate for example a single 64-bit random value between 0 and 2^64-1, and select 4 bits among them. The issue is that I can't find a way to have the generated values between 1 and 15. Example:

unsigned long long int r = uniform(generator); // between 0 and 2^64-1
unsigned int r1 = (r+1)&15;     // first desired random value
unsigned int r2 = ((r>>4)+1)&15; //second desired random value 
unsigned int r3 = ((r>>8)+1)&15; //third desired random value 
...

Here, this version of course doesn't work : despite the +1, the generated values are still between 0 and 15 (since if r&15 happens to be 0xb1111 then adding 1 produces the result 0xb0000).

Also, I would like the distribution to remain uniform (for instance, I wouldn't want to bias the least significant bit to occur more often, which could be the case with something like (r&15+1)|((r&15 +1) >> 4) since the value 0xb0001 would occur twice often).

nbonneel
  • 3,286
  • 4
  • 29
  • 39
  • 1
    You can still produce random number between [0-15*15*15*15[ and using `r%15 + 1`, `r/15 % 15 + 1`, `r / 15 / 15 % 15 + 1` and `r / 15 / 15 / 15 % 15 + 1`. – Jarod42 Jul 10 '19 at 23:23
  • if it is possible to avoid divisions and modulo I would prefer .. (they are quite costly : I'm not even sure if generating four 32bit random number would be less costly than this solution with 6 divisions and 4 modulos). I'd certainly prefer bitwise operations... but for sure, if that's even remotely possible! – nbonneel Jul 10 '19 at 23:28
  • With constant folding, there is actually only 3 divisions. and last modulo is superfluous too. Alternatively, there is function to compute `r/15` and `r%15` at once too. – Jarod42 Jul 10 '19 at 23:31
  • 1
    You can simply throw away any zero values. This will get you, on average, about 15 random values from each 64-bit random input. – David Schwartz Jul 10 '19 at 23:53
  • division is slow but division by a constant can be converted to a multiplication so it'll be fast. See [Why does GCC use multiplication by a strange number in implementing integer division?](https://stackoverflow.com/q/41183935/995714) – phuclv Jul 11 '19 at 04:53
  • Distributions don't have to throw away unused bits. The `operator()` is not `const`, and therefore distributions can cache unused bits and use them on the next number generation – KABoissonneault Jul 12 '19 at 14:59

2 Answers2

4

Instead of

std::mt19937 gen(seed);
std::uniform_int_distribution<> dis(1, 15);

auto r1 = dis(gen);
auto r2 = dis(gen);
auto r3 = dis(gen);
auto r4 = dis(gen);

You might do:

std::mt19937 gen(seed);
std::uniform_int_distribution<> dis(0, 15 * 15 * 15 * 15 - 1); // Assuming int at least 16 bits

auto r = dis(gen);

auto r1 = r % 15 + 1; r /= 15;
auto r2 = r % 15 + 1; r /= 15;
auto r3 = r % 15 + 1; r /= 15;
auto r4 = r + 1;

Quick benchmark (second version 2.5 times faster than first one)

Jarod42
  • 203,559
  • 14
  • 181
  • 302
  • I also tried it in the meantime and it indeed seems to be faster! thanks ! – nbonneel Jul 10 '19 at 23:52
  • 1
    `15 * 15 * 15 * 15 - 1` is just `50624`, so you're only using 16 bits at a time. You could get 8 values from the range 1..15 from each 32-bit value using `2562890624` as the upper bound. Or you could generate 32-bit values directly, use bit-shifting to extract 4-bit values, and manually discard any 0 values. Does `uniform_int_distribution` necessarily yield just one number from each pseudo-random 32-bit value, or is it smart enough to break it up into multiple values internally? – Keith Thompson Jul 11 '19 at 01:38
1

As mentioned by David Schwartz in the comment section

You can simply throw away any zero values. This will get you, on average, about 15 random values from each 64-bit random input.

Implementing a simple rejection sampling technique for this particular use case instead of using the more general std::uniform_int_distribution should be more efficient (e.g. see this Quick Bench comparing the standard class with the following).

class uniform_1_to_15_distribution
{
    uint64_t value_{};        // So that one could use std::mt19937_64 too
public:
    uniform_1_to_15_distribution() = default;

    template <class Gen> int operator() (Gen &g)
    {
        for (;;)
        {
            // When all the bits have been used (or only useless zeroes remain)
            if( value_ == uint64_t{} )
            {
                // Get a new value from the generator
                value_ = g();
                continue;
            }

            // Pick the 4 LS bits
            int ret = value_ & 0xF;

            // "Consume" the state
            value_ >>= 4;

            // Reject 0. Only value in range [1, 15] are accepted.
            if ( ret == 0 )
            {
                continue;
            }        
            return ret;
        }
    }
};

The distribution can be tested HERE.

Bob__
  • 12,361
  • 3
  • 28
  • 42