1

I did a fair amount of searching on Stack Overflow and Google before asking this question. I am working on a simulation and need to generate random input data between 0 and 1.0 that follows certain statistical distributions.

So far, I have gotten a normal distribution and a uniform real distribution working properly, but am stuck with the Pareto distribution.

The first two are available in boost/random/, but the Pareto distribution is only available as a raw distribution in boost/math/ (i.e. it cannot be used in the variate generator). Does anyone know of a way to generate such random numbers? Please note that I have already pored over the Boost documentation, including the page for the Pareto distribution. I am looking to generate random numbers that follow a Pareto distribution, not to use a Pareto distribution to determine statistical probabilities. The only thing I can think of so far is to use a uniform generator and plug those values into the CDF of the Pareto distribution (but there has to be a better way than this).

Any help would be greatly appreciated as I am new to boost.

Thank you!

Here is the code I am using for the first two, in tandem with a variate generator. This is all very much test code, so please don't hammer me on style or conventions:

#include <time.h>
#include <iostream>
#include <boost/random/normal_distribution.hpp>
#include <boost/random/uniform_real_distribution.hpp>
#include <boost/math/distributions/pareto.hpp>
#include <boost/random/mersenne_twister.hpp>
#include <boost/random/variate_generator.hpp>

int main(){
    boost::mt19937 randGen(time(0));

    boost::normal_distribution<> dist1(.5,.2);
    boost::random::uniform_real_distribution<> dist2(0.0,1.0);

    boost::variate_generator<boost::mt19937&,boost::normal_distribution<> > generator1(randGen,dist1);
    boost::variate_generator<boost::mt19937&,boost::random::uniform_real_distribution<> > generator2(randGen,dist2);

    for(int x = 0; x < 10; x++)
        std::cout << generator1() << std::endl;

    std::cout << "\n\n\n";

    for(int x = 0; x < 10; x++)
        std::cout << generator2() << std::endl;

    return 0;
}
MS-DDOS
  • But somehow you missed the first hit for "boost pareto distribution"? http://www.boost.org/doc/libs/1_41_0/libs/math/doc/sf_and_dist/html/math_toolkit/dist/dist_ref/dists/pareto.html – Alan Stokes Mar 19 '15 at 22:55
  • Not at all. This doesn't answer my question. This is how to use a Pareto distribution for determining probabilities according to the distribution. I need to generate random numbers that follow a Pareto distribution as sample input data. Please take the time to understand the essence of the question before leaping to conclusions. – MS-DDOS Mar 19 '15 at 23:09
  • @TylerS maybe you couldn't make the connection and the commenter didn't realize that. I, too, read this as "I can't find the implementation of the Pareto distribution". In general it's easy to preempt the answers you anticipate by pointing to them in your question. ("I know that boost implements the statistical distribution here, but I'm not sure how I can apply that (if at all) to Boost Random generators") – sehe Mar 20 '15 at 12:19
  • Good point. I edited the body to hopefully make that more clear. Thank you for the advice. – MS-DDOS Mar 20 '15 at 16:58

2 Answers

2

The Pareto distribution is related to the exponential distribution: if E is exponentially distributed with rate equal to the Pareto shape parameter, then scale * exp(E) is Pareto distributed. So you could use Boost to generate random values which follow an exponential distribution and manually calculate Pareto-distributed values from them. This question might also be of interest to you.
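As a rough sketch of that transformation (not taken from the original answer), the helper below draws an exponential variate and maps it to a Pareto variate. The name pareto_from_exponential and its parameters are hypothetical, and it uses std::exponential_distribution from the standard library so that it compiles without Boost; boost::random::exponential_distribution should be usable the same way, but treat that as an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Hypothetical helper: draw one Pareto(scale, shape) value.
// If E ~ Exponential(rate = shape), then scale * exp(E) ~ Pareto(scale, shape),
// because P(scale * exp(E) > x) = exp(-shape * ln(x / scale)) = (scale / x)^shape.
template <class Engine>
double pareto_from_exponential(Engine& eng, double scale, double shape) {
    assert(scale > 0.0 && shape > 0.0);
    std::exponential_distribution<double> expo(shape);  // rate = shape
    return scale * std::exp(expo(eng));
}
```

Every value returned is at least scale, so a sample can only fall below 1.0 when scale itself is below 1.0.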

sigy
0

After doing some more research and consulting some people in the stats department I found a way to do this using a uniform_real distribution. I originally tried to use the quantile function, as described in this post, but always got strings of 1's or 0's as my results.

After some additional trial and error I found that essentially all you need to do is plug the results of uniform real random into the cdf complement function.

Boost is interesting in that it uses non-member functions to calculate the cdf values, so the cdf is not a member of the Pareto distribution class itself. Instead, the proper way to do this in Boost is:

#include <boost/random/uniform_real_distribution.hpp>
#include <boost/random/mersenne_twister.hpp>
#include <boost/random/variate_generator.hpp>
#include <boost/math/distributions/pareto.hpp>

int main(){
     boost::mt19937 randGen(15); // seed held constant for repeatability
     boost::math::pareto_distribution<> dist; // default parameters: scale = 1, shape = 1
     boost::random::uniform_real_distribution<> uniformReal(1.0,10.0); // this range can be adjusted to affect the values

     // the variate generator wraps the uniform distribution, not the Pareto one
     boost::variate_generator<boost::mt19937&,boost::random::uniform_real_distribution<> > generator(randGen, uniformReal);

     double cdfComplement;
     for(int i = 0; i < 5000; i++){
          // cdf() and complement() are non-member functions in boost::math
          cdfComplement = boost::math::cdf(boost::math::complement(dist,generator()));
          //do something with value
     }

     return 0;
}
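For comparison, the textbook way to sample from a distribution with a known CDF is inverse transform sampling: draw u uniformly from [0, 1) and evaluate the quantile function (inverse CDF) at u. This is a different technique from the cdf complement approach above, shown only as a minimal sketch. The names pareto_quantile and sample_pareto are hypothetical, and the standard library is used instead of Boost so the snippet stands alone; in Boost.Math the same mapping is the non-member boost::math::quantile(dist, u), which expects u to be a probability between 0 and 1.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Pareto quantile (inverse CDF): solves u = 1 - (scale / x)^shape for x.
double pareto_quantile(double u, double scale, double shape) {
    assert(u >= 0.0 && u < 1.0);
    assert(scale > 0.0 && shape > 0.0);
    return scale / std::pow(1.0 - u, 1.0 / shape);
}

// Hypothetical helper: inverse transform sampling of one Pareto value.
template <class Engine>
double sample_pareto(Engine& eng, double scale, double shape) {
    std::uniform_real_distribution<double> unit(0.0, 1.0);  // u in [0, 1)
    return pareto_quantile(unit(eng), scale, shape);
}
```

Note that the uniform input has to be a probability in [0, 1); feeding a range such as 1.0 to 10.0 into a quantile function is outside its domain.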

So far I have not found a good way to limit the values to exactly the 0.0 to 1.0 range. Some outliers dip slightly below 0.0 and others go just over 1.0 (though this depends entirely on the range of real numbers that you feed into it). You can easily throw away values outside the range you are looking for.
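Throwing away out-of-range values amounts to rejection sampling. A minimal sketch of that idea follows, assuming Pareto values are drawn by inverse transform rather than by the cdf complement code above; the helper name sample_pareto_in_range and the bounds lo and hi are hypothetical. Because a Pareto variate is never smaller than its scale parameter, the scale has to be below hi for any draw to be accepted.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Hypothetical helper: redraw until a Pareto(scale, shape) value lands in [lo, hi].
// The accepted values follow a truncated Pareto distribution.
template <class Engine>
double sample_pareto_in_range(Engine& eng, double scale, double shape,
                              double lo, double hi) {
    assert(scale > 0.0 && shape > 0.0);
    assert(lo < hi && scale < hi);  // otherwise no draw could ever be accepted
    std::uniform_real_distribution<double> unit(0.0, 1.0);  // u in [0, 1)
    for (;;) {
        // Inverse transform draw: x = scale / (1 - u)^(1 / shape)
        double x = scale / std::pow(1.0 - unit(eng), 1.0 / shape);
        if (x >= lo && x <= hi) {
            return x;
        }
    }
}
```

For example, with scale = 0.05, shape = 1.5 and the range 0.0 to 1.0, only about 1% of the draws are rejected, so the loop rarely repeats.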

I was able to achieve these results using the default shape parameters and the method described above.

[Figure: Pareto random values with default shape parameters and 5,000 data points.]

MS-DDOS