1

I was rewriting this function:

long rng(long low, long high)
{
    return low + long((high - low + 1) * double(rand() / double(RAND_MAX + 1.0)));
}

Updating it to C++11 and Mersenne Twister.

long rng(long low, long high)
{
    static std::mt19937 rnumber; // Mersenne Twister
    std::uniform_int_distribution<unsigned> u (low, high);
    return u(rnumber);
}

Now this seems to work, but my C++11 book states that in similar cases BOTH the engine and the distributor should be "static". The problem is that I want to call the function with different ranges, and if I make the distributor static then the range is always the same.

So how should this be handled in a proper way?

Two other questions:

  • Why Stroustrup book seems to suggest I should use {low, high} instead of (low, high)? What's the difference?

  • Is is plausible that with gcc 4.8.1 and no optimization rand is twice as fast as Mersenne, whereas with -O3 rand has the same performance, while Mersenne becomes even actually faster than rand?

Abalieno
  • 159
  • 1
  • 1
  • 5
  • 1
    `{}` is "uniform initialization" and Stroustrup suggests it because it will lead to less confusion when used everywhere. – leemes Dec 29 '13 at 17:10
  • Try making it non-static and see what happens! – Oliver Charlesworth Dec 29 '13 at 17:12
  • 1
    @OliCharlesworth I guess he knows that the engine has to be static. He wonders why the distribution should be, too. Because in his case it can't. – leemes Dec 29 '13 at 17:13
  • 1
    Related: http://stackoverflow.com/a/19036349/592323 – leemes Dec 29 '13 at 17:14
  • 2
    I think that your real problem is the absence of a seed. – user2485710 Dec 29 '13 at 17:14
  • 2
    Also, please use `` for the distribution. It should name the type to be returned; in your case `long`, not `unsigned`. – leemes Dec 29 '13 at 17:15
  • Yes, that function works. I'm just curious about the actual difference having it static or not, and if there's a better method to organize all this. About using I'll probably make everything unsigned and stick to the standard... – Abalieno Dec 29 '13 at 17:18
  • @Abalieno priming a mersenne algorithm isn't cheap, since the seed affects it.(it uses a default, but still has to prime) Honestly, you need both a static mt-obj *and* a `std::random_device` with which to seed it. How you do the latter is up to you. – WhozCraig Dec 29 '13 at 17:36

3 Answers3

2

Distributions are allowed to calculate more than one random number when you use operator(), as long as the number of calls of g.operator() is amortized constant.

Therefore there is a significant difference between using a static distribution and a non-static one:

#include <iostream>
#include <random>

template <class Generator>
double use_rand(Generator & g){
    std::normal_distribution<double> d;
    return d(g);
}

template <class Generator>
double use_rand_static(Generator & g){
    static std::normal_distribution<double> d;
    return d(g);
}

int main(){
    std::mt19937 rnumber{5};
    std::cout << "non-static version:\n";
    std::cout << use_rand(rnumber) << "\n";
    std::cout << use_rand(rnumber) << "\n\n";

    rnumber.seed(5);
    std::cout << "static version:\n";
    std::cout << use_rand_static(rnumber) << "\n";
    std::cout << use_rand_static(rnumber) << std::endl;
    return 0;
}
non-static version:
0.107794
-0.199707

static version:
0.107794
-0.0306363

As you can see the second values differ. The reason for this is hidden in the implementation of normal_distribution::operator(), it calculates two values every other iteration.

This implementation defined behaviour is actually the reason why you should use the same distribution as long as you want to use values from the same probability space (aka static). However, since you want to change low and high often you're changing the probability space, so it's fine to use a new distribution (creating most distributions is actually pretty cheap, but using a new one for every value can get expensive).

Zeta
  • 103,620
  • 13
  • 194
  • 236
  • I tried to put in a header a number of optional distributors made static. Perfomance-wise, not really making any difference, actually the non static distributor within the function seems faster. Instead I tried to use a 1, 100 range and it makes the function TWICE as fast compared to, say, 1, 5 range. Don't know why. – Abalieno Dec 29 '13 at 21:00
  • 1
    @Abalieno: Downscaling. If you're going to use [1..5], you can use only ~83% of all values in [-2^63 ... 2^63 - 1]. If you're using [1..100] you can use ~99% of all values in the range of your URNG, which makes a _huge_ difference. Note that this difference vanishes if you use a better type for those values, e.g. `unsigned char`. – Zeta Dec 29 '13 at 21:20
  • @Zeta: Why would you only use ~83%? I would think you could use all but `0xFFFFFFFFFFFFFFFF%5 -> 1` value... – Mooing Duck Sep 08 '14 at 17:30
1

The distribution objects are objects which need some initialization. If you can reuse them, it will be more efficient to reuse them than creating them upon every use. Since the typical use of random numbers is to generate many values for the same ranges, the extra constructions may be removed. When replacing uses of rand() with a different random number generator it is probably not viable to pass the objects through. For newly created code it may be reasonable to keep sufficient context around to have a distribution object, too.

With respect to your add on questions:

  1. Using braces instead of parenthesis uses uniform initialization syntax (which is, of course, far from universally applicable). One of the advantages is that it refuses to do narrowing conversions.
  2. The computations in the C++11 random number generators are probably inline. There may be enough scope for optimizations to have gcc do the computations faster with -O3. There is, of course, also a chance that the compiler decides that you don't use the results from the random numbers and elides actually using it.
Dietmar Kühl
  • 150,225
  • 13
  • 225
  • 380
0

You don't have to specify in advance the limits for the distribution.

#include <iostream>
#include <ostream>
#include <random>

unsigned long rng(unsigned long low, unsigned long high)
{
    static std::mt19937 engine((std::random_device()()));
    static std::uniform_int_distribution<unsigned long> dist;

    std::uniform_int_distribution<unsigned long>::param_type p(low, high);

    return dist(engine, p);
}

int main()
{
    const unsigned int n = 10;
    const int m = 24;
    for (unsigned int k = 2; k != n + 1; ++k)
    {
        for (int j = 0; j != m; ++j)
        {
            // Print out 24 numbers from the range [0, k)
            std::cout << rng(0, k - 1) << " ";
        }
        std::cout << std::endl;
    }

    return 0;
}
user515430
  • 3,341
  • 2
  • 17
  • 13