6

I have a std::vector<double> of random numbers in [0, 1]. (How) can I use the standard library to convert it to a specific (e.g. Weibull) distribution (using the distribution's cumulative distribution function)?

To be clear: I don't have what the standard library considers a "generator" (I don't have a class whose operator() returns an integer). I already have a list of random doubles in [0, 1] and just want to use the standard library's implementation of the cumulative distribution function of the different distributions (e.g. Weibull).

Tom de Geus
  • You'll probably need to bin your values or the distribution won't look like much. If any representable value between 0 and 1 can be contained in your distribution, it will probably be very sparse, with few values occurring more than once. – François Andrieux May 05 '21 at 14:55
  • You could try to iterate over the vector and increment a value in a `map` for each occurrence. But you'll likely end up with a huge map of elements at value 1. This is related to my previous comment. – François Andrieux May 05 '21 at 15:00
  • @FrançoisAndrieux Thanks for your replies! What is not entirely clear to me is this: in theory I should be able to convert a set of floating-point numbers between 0 and 1 to a distribution without problems, as long as the cumulative distribution function is analytic and continuous. Numerically there might be some rounding errors, but other than that this should still be possible. So, is what you're describing a limitation specific to using the CDF from STL? – Tom de Geus May 05 '21 at 15:05
  • 3
    This problem is unrelated to the standard library. I misunderstood the question, I am proposing naive solutions and related problems when converting a set of values to a histogram (a representation of density) and not a density function. If you want to calculate the density *function* then you will probably have to implement an algorithm yourself or use a third party library. The standard distribution objects are strictly for shaping random bytes. You won't be able to automatically create a distribution object which matches a provided collection of values. – François Andrieux May 05 '21 at 15:09
  • You will be in better shape to use the STL RNG facilities if you somehow start from a vector of integers that are uniformly distributed. Starting from doubles is problematic because you have to do a lot of work: 1) apply the transformation of variables, 2) ensure that you have enough entropy for a new number to be generated. The best you can do at this point is to convert these doubles to integers in the range 0 to MAX_INT, for example, but really this still sounds bad. – alfC May 05 '21 at 15:39
  • @TomdeGeus You should read [What's the difference between “STL” and “C++ Standard Library”?](https://stackoverflow.com/questions/5205491/whats-the-difference-between-stl-and-c-standard-library) You probably don't actually mean STL here. – François Andrieux May 05 '21 at 17:26

2 Answers

3

No, you can't, not without iterative techniques.

The standard library is not quite a mathematics package. You need the quantile function for the desired distribution in order to convert a uniform drawing in [0, 1) (perhaps [0, 1] if you're lucky) to your distribution, and the standard library doesn't specify those functions. Note that the quantile function is the inverse of the cumulative distribution function.

Having faced exactly this problem in the past, I resorted to functions from the Boost distribution.
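For the Weibull case specifically, the quantile function has a closed form, so a hand-written version is short. The following is only a minimal sketch: the names `weibull_quantile` and `to_weibull` are hypothetical and not part of the standard library or Boost, and the parameterisation (shape k, scale λ, with CDF F(x) = 1 - exp(-(x/λ)^k)) is assumed to match the one you want.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: quantile (inverse CDF) of a Weibull distribution.
// Inverting F(x) = 1 - exp(-(x/scale)^shape) gives
//   x = scale * (-log(1 - u))^(1/shape).
// log1p(-u) computes log(1 - u) accurately for small u.
// Note: u == 1 maps to +infinity.
double weibull_quantile(double u, double shape, double scale)
{
    return scale * std::pow(-std::log1p(-u), 1.0 / shape);
}

// Hypothetical helper: transform uniform samples in [0, 1] element-wise.
std::vector<double> to_weibull(const std::vector<double>& uniforms,
                               double shape, double scale)
{
    std::vector<double> out;
    out.reserve(uniforms.size());
    for (double u : uniforms) {
        out.push_back(weibull_quantile(u, shape, scale));
    }
    return out;
}
```

Calling `to_weibull(values, 2.0, 1.0)` on the existing vector would then give one Weibull-distributed value per input value. Distributions whose quantile function has no closed form would still need a library such as Boost or an iterative inversion of the CDF.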

Bathsheba
  • Thanks. However unfortunate it is that this is not part of STL, this is a perfect answer to my question. Thanks for the Boost suggestion! – Tom de Geus May 05 '21 at 16:11
-1

I don't have what the standard library considers a "generator" (I don't have a class whose operator() returns an integer)

So make one!

Write a little class that reads your list of random numbers and then outputs the "next" one each time () is called.
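A minimal sketch of such a class might look like the following. The name `ReplayGenerator` and the scaling to 32-bit integers are assumptions made for illustration. The standard does not specify how a distribution object consumes generator output (it may call the generator more than once per value), so passing this to a standard distribution is not guaranteed to reproduce the inverse-CDF mapping of each stored value.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical adapter: replays stored doubles through the interface the
// standard library expects of a uniform random bit generator.
// Assumes every stored value lies in [0, 1].
class ReplayGenerator {
public:
    using result_type = std::uint32_t;

    explicit ReplayGenerator(std::vector<double> values)
        : values_(std::move(values)) {}

    static constexpr result_type min() { return 0; }
    static constexpr result_type max()
    {
        return std::numeric_limits<result_type>::max();
    }

    // Returns the next stored value scaled to [min(), max()];
    // throws once the list is exhausted.
    result_type operator()()
    {
        if (next_ >= values_.size()) {
            throw std::out_of_range("ReplayGenerator: no values left");
        }
        const double u = values_[next_++];
        return static_cast<result_type>(u * static_cast<double>(max()));
    }

private:
    std::vector<double> values_;
    std::size_t next_ = 0;
};
```

Note that the scaling truncates each double to 32 bits of resolution, which is exactly the pass through integers that the question hoped to avoid.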

ravenspoint
  • 1
    A generator generates random bits. Distributions then take those bits to generate values. Even after you convert a collection of `double`s into a bit stream, the distribution objects could transform them however they want. You couldn't reliably get what OP is looking for, which is a mapping through the cumulative distribution function. – François Andrieux May 05 '21 at 14:57
  • Fair point. I just really want to avoid passing through integers. – Tom de Geus May 05 '21 at 14:58