0

I have a piecewise quartic distribution with a probability density function:

p(x)= c(x/a)^2 if 0≤x<a; 
      c((b+a-x)^2/b)^2 if a≤x≤b;
      0 otherwise

Suppose c, a, b are known, I am trying to draw 100 random samples from the distribution. How can I do it with numpy/scipy?

Cœur
  • 37,241
  • 25
  • 195
  • 267
Chiefscreation
  • 181
  • 3
  • 12
  • 1
    Your function appears piecewise quartic rather than quadratic. Also -- is this intended to be the *density* function or the *cumulative distribution* function? – John Coleman Jan 04 '17 at 17:50
  • @JohnColeman thanks! have edited it. this is a density function. – Chiefscreation Jan 04 '17 at 18:16
  • 3
    One standard way is to find an explicit formula, `F^-1` for the inverse of the cumulative distribution function. That is doable here (although it will naturally be piecewise defined) and then use `F^-1(U)` where `U` is uniform on [0,1] to generate your samples. Note that `c` is easily computed from `a` and `b` by the requirement that the result integrates to 1. – John Coleman Jan 04 '17 at 18:33
  • @John Coleman your comment could be an answer, because, er, this is *the* answer – ev-br Jan 05 '17 at 01:40
  • @ev-br I resisted the idea at first since the algebra seemed daunting and I didn't like the idea of posting a code-free answer, but then I found a way to simplify it a bit. – John Coleman Jan 05 '17 at 15:49

2 Answers2

5

One standard way is to find an explicit formula, G = F^-1 for the inverse of the cumulative distribution function. That is doable here (although it will naturally be piecewise defined) and then use G(U) where U is uniform on [0,1] to generate your samples.

In this case, I think that I worked out the details, but you will need to check the Calculus/Algebra.

First of all, to streamline things it helps to introduce a couple of new parameters. Let

f(a,b,c,d,x) = c*x**2 #if 0 <= x <= a

and

f(a,b,c,d,x) = d*(x-e)**4 #if a < x <= b

Then your p(x) is given by

p(x) = f(a,b,c/a**2,c/b**2,a+b)

I integrated f to find the cumulative distribution and then inverted and got the following:

def Finverse(a,b,c,d,e,x):
    if x <= (c*a**3)/3:
        return (3*x/c)**(1/3)
    else:
        return e + ((a-e)**5 - (5*c*a**3)/(3*d))**(1/5)

Assuming this is right, then simply:

def randX(a,b,c):
    u = random.random()
    return Finverse(a,b,c/a**2,c/b**2,a+b,u)

In this case it was possible to work out an explicit formula. When you can't work out such a formula for the inverse, consider using the Monte Carlo methods described by @lucianopaz

John Coleman
  • 51,337
  • 7
  • 54
  • 119
4

As your function is bounded both in x and p(x), I recommend that you use Monte Carlo rejection sampling. The basic principle is that you draw two uniform random numbers, one representing a candidate x in the x space bounds [0,b] and another representing y. If y is lower or equal to the normalized p(x), then the sampled x is returned, if not it continues to the next iteration

import numpy as np
def rejection_sampler(p,xbounds,pmax):
    while True:
        x = np.random.rand(1)*(xbounds[1]-xbounds[0])+xbounds[0]
        y = np.random.rand(1)*pmax
        if y<=p(x):
            return x

Here, p should be a callable to your normalized piecewise probability density, xbounds can be a list or tuple containing the lower and upper bounds, and pmax the maximum of the probability density in the x interval.

lucianopaz
  • 1,212
  • 10
  • 17