
I was hoping to learn how to generate numbers from a normal distribution in C++ when I saw this post. It gives a very good example, but I am still not sure what the & in boost::variate_generator<boost::mt19937&, boost::normal_distribution<> > var_nor(rng, nd); means. What effect would it have if I did not include the & here?

Also, when reading the tutorial on Boost's official website, I found that after creating a distribution object with boost::random::uniform_int_distribution<> dist(1, 6), they were able to generate random numbers directly by calling dist(gen) (gen here is the random engine), without using a variate_generator object. Of course, this is for generating uniform random numbers, but I am curious whether I can do the same with a normal distribution, as an alternative to using variate_generator.

Vokram

1 Answer


Short background information

One approach to generating random numbers with a specific distribution is to generate uniformly distributed random numbers, for example from the interval [0, 1), and then apply some maths to shape them into the desired distribution. So you have two objects: a generator that produces random numbers from [0, 1), and a distribution object that takes these uniformly distributed numbers and outputs random numbers following the desired (e.g. normal) distribution.
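As a rough illustration of this two-object idea, here is a minimal sketch that uses the C++11 standard library (whose <random> header follows the same engine/distribution design) rather than Boost itself. It shapes uniform numbers into normal ones with the Box-Muller transform, which is just one possible method and not necessarily what Boost uses internally. The names box_muller and sample_standard_normal are hypothetical.

```cpp
#include <cmath>
#include <limits>
#include <random>

// Hypothetical helper: maps two uniform samples, u1 in (0, 1] and u2 in [0, 1),
// to one standard normal sample using the Box-Muller transform.
double box_muller(double u1, double u2)
{
    const double two_pi = 6.283185307179586;
    // Clamp u1 away from zero so that std::log never sees 0.
    const double tiny = std::numeric_limits<double>::min();
    if (u1 < tiny)
        u1 = tiny;
    return std::sqrt(-2.0 * std::log(u1)) * std::cos(two_pi * u2);
}

// Hypothetical helper: draws uniform numbers from the engine and shapes them.
// The engine is taken by reference so that its state advances for the caller.
double sample_standard_normal(std::mt19937& engine)
{
    std::uniform_real_distribution<double> uniform01(0.0, 1.0); // nominally [0, 1)
    const double u1 = 1.0 - uniform01(engine); // shift to (0, 1]
    const double u2 = uniform01(engine);
    return box_muller(u1, u2);
}
```

Given a std::mt19937 engine, each call to sample_standard_normal(engine) consumes two uniform numbers and returns one approximately normally distributed value.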

Why pass the generator by reference

The var_nor object in your code couples the generator rng with the normal distribution nd. The & in the template argument means that var_nor holds a reference to your generator instead of a copy. This is essential, because the random number generator has an internal state from which it computes the next (pseudo-)random number. Without the &, var_nor would work on its own copy of the generator and the state of rng itself would never advance, which can lead to code that produces the same random numbers over and over, for example when a new variate_generator is constructed from the same rng each time. See this blog post for an example.
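The copy-versus-reference effect can be sketched without Boost. The following is only an analogy, assuming the C++11 standard library: the two hypothetical functions draw_from_copy and draw_from_reference stand in for variate_generator with and without the & in its first template argument.

```cpp
#include <random>

// Hypothetical helper: the engine is COPIED into the function,
// so the caller's engine never advances.
double draw_from_copy(std::mt19937 engine)
{
    std::normal_distribution<double> nd(0.0, 1.0);
    return nd(engine);
}

// Hypothetical helper: the engine is passed by reference,
// so every call advances the caller's engine.
double draw_from_reference(std::mt19937& engine)
{
    std::normal_distribution<double> nd(0.0, 1.0);
    return nd(engine);
}
```

Calling draw_from_copy(rng) twice with the same rng returns the same value both times, whereas two calls to draw_from_reference(rng) return successive values of the sequence.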

Why the variate_generator is necessary

Now to the question of why you should not use the distribution directly with the generator. If you try the following code

#include <boost/random/mersenne_twister.hpp>
#include <boost/random/normal_distribution.hpp>
#include <iostream>

int main()
{
    boost::mt19937 generator;    
    boost::normal_distribution<> distribution(0.0, 1.0);

    // WARNING: THIS DOES NOT WORK AS MIGHT BE EXPECTED!!
    for (int i = 0; i < 100; ++i)
        std::cout << distribution(generator) << std::endl;
    return 0;
}

you will see that it outputs only NaNs (I tested it with Boost 1.46). The reason is that the Mersenne Twister returns uniformly distributed integer random numbers, whereas most (probably even all) continuous distributions in that version of Boost expect floating-point random numbers from the range [0, 1). The variate_generator takes care of this conversion between the two. The example given in the Boost documentation works because uniform_int_distribution is a discrete distribution and can therefore deal with integer RNGs.

Note: I have not tried the code with a newer version of Boost. Of course, it would be nice if the compiler reported an error when a discrete RNG is used together with a continuous distribution.
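For comparison, the C++11 standard library's <random> header is specified so that a distribution can be called directly on an integer engine such as std::mt19937, with no wrapper object. The sketch below shows that pattern. It says nothing about how any particular Boost release behaves, and the function name draw_normals is hypothetical.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: draws n samples by calling the distribution
// directly on the engine, without a variate_generator-style wrapper.
std::vector<double> draw_normals(unsigned seed, std::size_t n)
{
    std::mt19937 generator(seed);
    std::normal_distribution<> distribution(0.0, 1.0); // mean 0, standard deviation 1
    std::vector<double> samples;
    samples.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        samples.push_back(distribution(generator));
    return samples;
}
```

For example, draw_normals(42u, 100) returns 100 finite values, and the same seed reproduces the same sequence on a given standard library implementation.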

Mehrwolf
  • Thank you for the helpful background information, Mehrwolf! Just curious, why didn't the original post (and also your answer here) specify any typename in `boost::normal_distribution<> distribution(0.0, 1.0)`? Does it make any difference if I write it as `boost::normal_distribution<double> distribution(0.0, 1.0)`? (Actually I am confused why your usage is legal here, based on my very limited knowledge of class templates...) – Vokram Aug 12 '12 at 08:55
  • After some googling I think I got why. There is a default typename specified with the `=` in the definition of the class template for the distribution object (i.e. `template<class RealType = double>`). So `double` is set to be the default typename here. – Vokram Aug 12 '12 at 09:33
  • Correct. double is the default type. – Mehrwolf Aug 12 '12 at 10:26
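The default-argument mechanism discussed in these comments can be sketched with a toy class template. The name toy_distribution is hypothetical and only loosely mimics how a distribution class might declare its default type.

```cpp
#include <type_traits>

// Hypothetical class template whose type parameter defaults to double.
template <class RealType = double>
struct toy_distribution
{
    typedef RealType result_type;
};

// Empty angle brackets select the default argument,
// so both spellings name the same type.
static_assert(std::is_same<toy_distribution<>, toy_distribution<double> >::value,
              "toy_distribution<> is the same type as toy_distribution<double>");
```

Writing toy_distribution<float> instead overrides the default, in the same way that an explicit type can be given to a distribution.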