python: random sampling from self-defined probability function

Question

I have a piecewise quartic distribution with a probability density function:

p(x)= c(x/a)^2 if 0≤x<a; 
      c((b+a-x)^2/b)^2 if a≤x≤b;
      0 otherwise

Suppose c, a, b are known, I am trying to draw 100 random samples from the distribution. How can I do it with numpy/scipy?

Your function appears piecewise quartic rather than quadratic. Also -- is this intended to be the *density* function or the *cumulative distribution* function? — John Coleman, Jan 04 '17 at 17:50
@JohnColeman thanks! have edited it. this is a density function. — Chiefscreation, Jan 04 '17 at 18:16
One standard way is to find an explicit formula, `F^-1` for the inverse of the cumulative distribution function. That is doable here (although it will naturally be piecewise defined) and then use `F^-1(U)` where `U` is uniform on [0,1] to generate your samples. Note that `c` is easily computed from `a` and `b` by the requirement that the result integrates to 1. — John Coleman, Jan 04 '17 at 18:33
@John Coleman your comment could be an answer, because, er, this is *the* answer — ev-br, Jan 05 '17 at 01:40
@ev-br I resisted the idea at first since the algebra seemed daunting and I didn't like the idea of posting a code-free answer, but then I found a way to simplify it a bit. — John Coleman, Jan 05 '17 at 15:49

score 5 · Accepted Answer · answered Jan 05 '17 at 15:46

One standard way is to find an explicit formula, G = F^-1 for the inverse of the cumulative distribution function. That is doable here (although it will naturally be piecewise defined) and then use G(U) where U is uniform on [0,1] to generate your samples.

In this case, I think that I worked out the details, but you will need to check the Calculus/Algebra.

First of all, to streamline things it helps to introduce a couple of new parameters. Let

f(a,b,c,d,x) = c*x**2 #if 0 <= x <= a

and

f(a,b,c,d,x) = d*(x-e)**4 #if a < x <= b

Then your p(x) is given by

p(x) = f(a,b,c/a**2,c/b**2,a+b)

I integrated f to find the cumulative distribution and then inverted and got the following:

def Finverse(a,b,c,d,e,x):
    if x <= (c*a**3)/3:
        return (3*x/c)**(1/3)
    else:
        return e + ((a-e)**5 - (5*c*a**3)/(3*d))**(1/5)

Assuming this is right, then simply:

def randX(a,b,c):
    u = random.random()
    return Finverse(a,b,c/a**2,c/b**2,a+b,u)

In this case it was possible to work out an explicit formula. When you can't work out such a formula for the inverse, consider using the Monte Carlo methods described by @lucianopaz

score 4 · Answer 2 · answered Jan 04 '17 at 18:43

As your function is bounded both in x and p(x), I recommend that you use Monte Carlo rejection sampling. The basic principle is that you draw two uniform random numbers, one representing a candidate x in the x space bounds [0,b] and another representing y. If y is lower or equal to the normalized p(x), then the sampled x is returned, if not it continues to the next iteration

import numpy as np
def rejection_sampler(p,xbounds,pmax):
    while True:
        x = np.random.rand(1)*(xbounds[1]-xbounds[0])+xbounds[0]
        y = np.random.rand(1)*pmax
        if y<=p(x):
            return x

Here, p should be a callable to your normalized piecewise probability density, xbounds can be a list or tuple containing the lower and upper bounds, and pmax the maximum of the probability density in the x interval.

python: random sampling from self-defined probability function

2 Answers2

Linked