
We have uniform_int_distribution and uniform_real_distribution. Wouldn't it be possible to have an encompassing uniform_distribution that specializes for float/double and int/... as appropriate?

dylan-myers
mike
  • 10
    One thought: uniform_int_distribution uses a closed interval, uniform_real_distribution uses a half-open interval. – user877329 Aug 14 '16 at 07:52
  • Can't you define your own structure and specialize it? It's trivial indeed. – skypjack Aug 14 '16 at 08:25
  • 1
    @AmiTavory [reference](http://en.cppreference.com/w/cpp/numeric/random/uniform_int_distribution) does not agree. Otherwise you would not be able to get INT_MAX and similar from it. Edit: standard does not agree too, in `[rand.dist.uni.int]` it states: _A `uniform_int_distribution` random number distribution produces random integers i, a <= i <= b_ – Revolver_Ocelot Aug 14 '16 at 09:13
  • @Revolver_Ocelot You're right, thanks. – Ami Tavory Aug 14 '16 at 09:20
  • @user877329: conceptually, _any real uniform distribution is on an interval that's open on both sides_ ([the probability of getting a point on the boundary is zero](https://en.wikipedia.org/wiki/Almost_surely)). Whether a concrete floating-point implementation allows points on the ends to happen is of course important to know, but it's not something that would sensibly require giving it a different name from the closed-interval integral distribution. Frankly, a well-behaved application of real distributions shouldn't depend on whether a single particular value turns up or not. – leftaroundabout Aug 14 '16 at 13:02

1 Answer


AFAIU, the comments above by @user877329 and @revolver_ocelot explain this correctly, and the other answer is completely wrong.

It is wrong to unify the interfaces of uniform_int_distribution and uniform_real_distribution, not because they are implemented differently (which could be handled via template specialization), but because the interfaces mean different things.

Suppose we unify the interfaces (using a variation of the suggestion in the other answer), like so:

template <typename T>
using uniform_distribution =
    typename std::conditional<
        std::is_integral<T>::value,
        std::uniform_int_distribution<T>,
        std::uniform_real_distribution<T>
    >::type;

Then if we define uniform_distribution<some_type> u(0, 9), the meaning depends heavily on some_type:

  • if some_type is integral, then u will output 9 approximately 1/10 of the time.

  • if some_type is a floating-point type, then u will never output 9.

The following code (whose output is true and then false) illustrates this:

#include <cstddef>
#include <iostream>
#include <random>
#include <type_traits>

// Picks the integer or the real distribution depending on T.
template <typename T>
using uniform_distribution =
    typename std::conditional<
        std::is_integral<T>::value,
        std::uniform_int_distribution<T>,
        std::uniform_real_distribution<T>
    >::type;

int main()
{
    std::random_device rd;
    std::mt19937 gen(rd());

    {
        // Integral case: closed interval [0, 9], so 9 is a possible result.
        uniform_distribution<int> u(0, 9);
        bool over_found = false;
        for(std::size_t i = 0; i < 99999; ++i)
            over_found = over_found || u(gen) >= 9;
        std::cout << std::boolalpha << over_found << std::endl;
    }

    {
        // Floating-point case: half-open interval [0, 9), so 9 is excluded.
        uniform_distribution<float> u(0, 9);
        bool over_found = false;
        for(std::size_t i = 0; i < 99999; ++i)
            over_found = over_found || u(gen) >= 9;
        std::cout << std::boolalpha << over_found << std::endl;
    }
}

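This illustrates that writing generic code with this alias is dangerous. For example, if you wrote a generic function computing a histogram of the results over the subranges [0, 1), [1, 2), ..., [8, 9), the two cases would behave differently: the floating-point samples would all land in some bin, while about a tenth of the integer samples (the 9s) would fall outside every bin.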
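A minimal sketch of such a histogram function, to make the mismatch concrete. The names histogram and sample_histogram are hypothetical helpers made up for this illustration, and double is used instead of float; the alias is the one from above.

```cpp
#include <array>
#include <cstddef>
#include <random>
#include <type_traits>

// Same alias as above: integer or real distribution depending on T.
template <typename T>
using uniform_distribution =
    typename std::conditional<
        std::is_integral<T>::value,
        std::uniform_int_distribution<T>,
        std::uniform_real_distribution<T>
    >::type;

// Hypothetical result type: counts for the bins [0, 1), ..., [8, 9),
// plus the number of samples that fell outside all of them.
struct histogram {
    std::array<std::size_t, 9> bins{};
    std::size_t dropped = 0;
};

// Hypothetical generic function: draws n samples from
// uniform_distribution<T>(0, 9) and sorts them into half-open bins.
template <typename T, typename Gen>
histogram sample_histogram(Gen& gen, std::size_t n)
{
    uniform_distribution<T> u(0, 9);
    histogram h;
    for (std::size_t i = 0; i < n; ++i) {
        const T x = u(gen);
        if (x >= T(0) && x < T(9))
            ++h.bins[static_cast<std::size_t>(x)];  // truncation picks the bin
        else
            ++h.dropped;  // only reachable when the upper bound is inclusive
    }
    return h;
}
```

Calling sample_histogram<int>(gen, n) with a std::mt19937 gen should leave roughly a tenth of the samples in dropped, whereas sample_histogram<double>(gen, n) should leave dropped at zero, even though both calls go through the same generic code.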


As @revolver_ocelot points out, the standard library's usual [inclusive begin, exclusive end) convention cannot be used for uniform integers, because it would then be impossible to specify a uniform integer distribution that can also generate the maximum value of the type (e.g. the maximum unsigned int). This makes the closed-interval signature of uniform_int_distribution an exception.
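To see why, here is a small sketch of a distribution covering every unsigned int value. The function name full_range_unsigned is made up for this illustration; a(), b() and the two-argument constructor are the standard uniform_int_distribution interface.

```cpp
#include <limits>
#include <random>

// Hypothetical helper: a distribution over every value of unsigned int.
inline std::uniform_int_distribution<unsigned int> full_range_unsigned()
{
    // Closed interval [0, max]. With a half-open [a, b) convention the upper
    // bound would have to be max + 1, which wraps around to 0 for unsigned
    // int and so cannot be expressed in the type itself.
    return std::uniform_int_distribution<unsigned int>(
        0u, std::numeric_limits<unsigned int>::max());
}
```

With the closed interval, b() on the returned distribution is simply the maximum unsigned int, so that value is a legitimate result.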

Community
Ami Tavory
  • 2
    Inclusive or exclusive range makes no difference because the probability to get a specific real number such as 9 is 0 anyways, so never seeing a 9 is expected. Granted doubles and floats are not real numbers in the mathematical sense, but that difference is negligible considering we are talking about random numbers and floats and doubles are only approximations for reals anyways. – nwp Aug 14 '16 at 11:00
  • @nwp Hence also my comment about the generic function drawing a histogram. Surely the probability of an event such as *[8, 9)* is nonzero even for math-like real numbers. Unfortunately, standard-library-style bins would leave out a non-negligible part of the results given the union of these events. In a very practical sense, these signatures are just different. – Ami Tavory Aug 14 '16 at 11:11
  • 2
    @nwp: "the probability to get a specific real number such as 9 is 0 anyways" -- that's untrue, because despite the name `uniform_real_distribution` is not a continuous distribution over the real numbers. It's a discrete distribution over the finite subset of the rational numbers that are represented by `RealType` (which by default is `double`). Including or excluding the endpoint therefore *does* make a difference. – Steve Jessop Aug 14 '16 at 13:19
  • Granted the probability of generating that specific value is *usually* small, but for example if your range is `[1, 1 + DBL_EPSILON)` then including or not including the endpoint is the difference between the distribution being two-valued or one-valued. Anyway, if you want to take your random number and do something like `1 / (endpoint - x)` to it, then you care whether there is or isn't a small probability of your code crashing. Not that the distribution of `1 / (endpoint - x)` is perfect, but it's a lot worse if it sometimes divides by zero. – Steve Jessop Aug 14 '16 at 13:25
  • 1
    ... so you can probably argue that the standard *should* have included the endpoint, and users *shouldn't* write code that relies on the endpoint being excluded. I don't really know the arguments for one or the other. But it certainly does make a difference whether they included it or not, and they decided to support uses that want it excluded. – Steve Jessop Aug 14 '16 at 13:28