2

Let's look for example at std::normal_distribution.

I know that there is a handful of algorithms to sample from a normal distribution (reference: Wikipedia). And I know that the standard specification usually leaves it to the implementation to choose the algorithm (reference: SO).

However, it is sometimes required to specify which algorithm is in use under the hood. How to find out the details about the implementation?

(I admit that I don't know much about the different implementations of the C++ standard library that exist in this world. Mostly, I'm using the ones shipped with XCode/clang, gcc and MSVC.)

Community
  • 1
  • 1
normanius
  • 8,629
  • 7
  • 53
  • 83
  • 1
    Read the source code. Why do you need to know the exact algorithm? – Dietrich Epp Dec 27 '15 at 21:58
  • 2
    @DietrichEpp: Why to know the algorithm? For proper documentation, out of interest, to link theory with practice... – normanius Dec 27 '15 at 22:18
  • 1
    @normanius: The two common algorithms are the [Box-Muller transform](https://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform) and the [Ziggurat algorithm](https://en.wikipedia.org/wiki/Ziggurat_algorithm). They are fairly straightforward to implement yourself, if you want to learn about statistical techniques. The reason why the C++ standard doesn't specify the algorithm is because a different algorithm will be a better choice depending on the kind of computer you are using (e.g. RAM constraints or slow transcendental functions). – Dietrich Epp Dec 27 '15 at 22:23
  • 2
    @DietrichEpp Also, the [Marsaglia polar method](https://en.wikipedia.org/wiki/Marsaglia_polar_method), which last time I checked was actually used by libstdc++, libc++, and MSVC. – T.C. Dec 27 '15 at 22:28
  • @DietrichEpp I understand why not to fix the implementation details. Still, I hoped to find a reference document that wraps up the choices made by a certain standard library and that refers to actual algorithms (default: Marsaglia polar, special case A: Box-Muller, special case B: ...) – normanius Dec 27 '15 at 23:25
  • @normanius: Besides reading the source code, you can also read the commit messages which added each particular algorithm to the source, at least for libstdc++ and libc++. This is kind of a code archaeology question. – Dietrich Epp Dec 27 '15 at 23:29

3 Answers3

1

Sometimes a utility's behavior is explicitly defined by the standard, sometimes it's not.

  1. Look in the standard; if found, it's strictly-conforming and portable. Yoohay!
  2. If not specified or explicitly implementation-defined, look into the standard library implementation of your choice. The source code will explain it.
    Unfortunately it's implementation-defined and unportable then. If specified by POSIX or something similar, yoohay again, but only for POSIX-conforming or "something similar"-conforming platforms.

Here's an example:

The C++14 standard draft N4296 says in §26.5.8.5.1:

A normal_distribution random number distribution produces random numbers x distributed according to the probability density function

formula

The distribution parameters µ and σ are also known as this distribution's mean and standard deviation.

I have no idea about PRNG, so I cannot explain this formula to you but I think this is the thing you were looking for.

There you have a function (more specifically: a "probability density function") for calculating random numbers using normal distribution. The whole algorithm builds around this and can be found in the corresponding standard library implementation.

T.C.
  • 133,968
  • 17
  • 288
  • 421
cadaniluk
  • 15,027
  • 2
  • 39
  • 67
  • 3
    Since that quote basically says "`normal_distribution` produces normally distributed random numbers", it's quite unlikely that it's what the OP's looking for. – T.C. Dec 27 '15 at 22:15
  • @T.C. Done. Correct now? And regarding your last comment: does the function not add a whole lot of more information to the quotation? – cadaniluk Dec 27 '15 at 22:30
  • 2
    That function just says that the generated random numbers are normally distributed, i.e., that `normal_distribution` does what it is advertised to do. It doesn't add anything beyond that. – T.C. Dec 27 '15 at 22:55
  • 1
    The situation looks quite disappointing. What I learn from your answer and comments is that there is generally little user documentation about the implementation. For the particular example I was referring to (`std::normal_distribution()`), I found only some information for [gcc libstdc++](https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-api-4.6/a00264.html#add4f70b08330485b611f81b2b7cbb257), but no details for [msvc](https://msdn.microsoft.com/en-us/library/bb982827.aspx) and nothing for XCode/clang at all. – normanius Dec 27 '15 at 23:17
  • @normanius: Just to clarify, this is text from the ISO C++ Standard (or more precisely, a draft version). The audience of the C++ Standard is compiler writers and C++ book authors. It's definitely not end-user documentation. – MSalters Dec 28 '15 at 03:09
  • @cad: The real problem with that text is that its audience actually knows `normal_ditstribution` isn't a PRNG at all and does not produce random numbers at all (!). The text is quite misleading. In reality, `normal_distribution` is a transformation of a sequence of independent random integers, and t assumes the input has a flat distribution, – MSalters Dec 28 '15 at 03:12
0

I had to write a Wrapper class like this:

struct Generator{
  Generator() : val(0), count(0) {}
  Generator(std::mt19937&& aGen) : val(0), count(0), theGen(aGen) {}
  long long int operator()(void){
    val = theGen();
    std::cout << val << " " << count << std::endl;
    ++count;
    return val;
  }
  long long int max(){return theGen.max();};
  long long int min(){return theGen.min();};
  long long int val;
  size_t count;
  std::mt19937 theGen;
}; 

to get some introspection on how my compiler implements normal_distribution.

In order to see it in action you have to write something along these lines:

std::normal_distribution<> dis(1.0, 2.0);
Generator gen(std::mt19937 (42));
dis(gen);

calling iteratively dis(gen); could be instructive

jimifiki
  • 5,377
  • 2
  • 34
  • 60
0

Normal Distribution is found here

Core logic is

        uniform_real_distribution<result_type> _Uni(-1, 1);
        result_type __u;
        result_type __v;
        result_type __s;
        do
        {
            __u = _Uni(__g);
            __v = _Uni(__g);
            __s = __u * __u + __v * __v;
        } while (__s > 1 || __s == 0);
        result_type _Fp = _VSTD::sqrt(-2 * _VSTD::log(__s) / __s);
        __v_ = __v * _Fp;
        __v_hot_ = true;
        _Up = __u * _Fp;
Sam P
  • 681
  • 5
  • 19