scipy.stat.norm.pdf does not add up to one

Question

I'm trying to compute the integral of a gaussian from an arbitrary point gamma as follows:

import numpy as np
from scipy.stats import norm

def func(mu, sigma, gamma):
    x = np.linspace(mu-4*sigma, mu + 4*sigma, 100)
    y = norm.pdf(x, mu, sigma)

    area = y[x>=gamma].sum()

    print(f"Area is ~ {area:.3f}")

    plt.plot(x,y, label=f'μ0 = {mu}, σ={sigma}')
    plt.fill_between(x[x>gamma],y[x>gamma], alpha=.5, label=f'area={area:.3f}')
    plt.title("Example")
    plt.legend(loc='best')
    plt.show()

# Execute the function
func(0,10,-20)

Output is: Area is ~ 1.211

The blue area is integrated but it adds more than one (far more), even when it's not the complete function.

I found this related but didn't help.

Why is it adding up more than one?

Your approximation of the integral is not correct. Each term being summed must be multiplied by `dx`, the spacing of the `x` samples. Try `area = y[x>=gamma].sum()*(x[1] - x[0])` — Warren Weckesser, Oct 09 '18 at 04:18
Thanks! suach a newby mistake xD I fixed it and posted as an answer for completness — Rodrigo Laguna, Oct 09 '18 at 04:33

score 1 · Accepted Answer · edited Oct 09 '18 at 04:36

As @Warren Weckesser says, the problem is that I'm not considering the dx term (the base size for the intergal)

Fortunately this is easy to fix, so for completness, here it goes...

import numpy as np
from scipy.stats import norm

def func(mu, sigma, gamma):
    # Notice the retstep=True param, it will return the dx needed
    x, dx = np.linspace(mu-4*sigma, mu + 4*sigma, 100, retstep=True)
    y = norm.pdf(x, mu, sigma)

    # multiply all with dx
    area = y[x>=gamma].sum() * dx

    print(f"Area is ~ {area:.3f}")

    plt.plot(x,y, label=f'μ0 = {mu}, σ={sigma}')
    plt.fill_between(x[x>gamma],y[x>gamma], alpha=.5, label=f'area={area:.3f}')
    plt.title("Example")
    plt.legend(loc='best')
    plt.show()

func(0,10,-20)

Now output makes sence, it's Area is ~ 0.978

and the image is

Thanks a lot @Warren Weckesser!!

scipy.stat.norm.pdf does not add up to one

1 Answers1