How to compute the percentiles of a normal distribution in Python?

Question

Problem Statement - A random variable X is N(25, 4). Find the indicated percentile for X:

a. The 10th percentile

b. The 90th percentile

c. The 80th percentile

d. The 50th percentile

Attempt 1

My code:

import numpy as np
import math
import scipy.stats
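mu = 25
sigma = 4
a = mu - (1.282 * sigma)  # 10th percentile, z = -1.282 from the table
b = mu + (1.282 * sigma)  # 90th percentile, z = +1.282 from the table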

... and so on. I took the z-scores from the table at https://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_probability/bs704_probability10.html

Attempt 2

# sample size is not mentioned in the problem; I just assumed 10000
X = np.random.normal(25, 4, 10000)
a_9 = np.percentile(X,10)
b_9 = np.percentile(X,90)
c_9 = np.percentile(X,80)
d_9 = np.percentile(X,50)

But the answers are incorrect as per the hidden test cases of the practice platform. Can anyone please tell me the right way to compute the answers? Is there any scipy.stats function for this?

Why is the second attempt incorrect? Do you have some test cases where it fails? — David, Feb 01 '21 at 06:41
Yes. My answers are not matching the predefined hidden answers of the test cases. — MVKXXX, Feb 01 '21 at 06:58
As I mentioned in the code comment, I assumed a sample size of 10000. It was not given in the question. Maybe that is the issue, I don't know. Is there an alternative way to approach the problem? — MVKXXX, Feb 01 '21 at 07:02
In attempt 2 you're filling X with random data, so the percentiles will differ on every execution. Z-scores are not fixed values; they are calculated as `z = (x - mu) / sigma`, so random data will never deliver the same results twice. As you already have the Z-scores, you can calculate the percentiles as `mu + (z * sigma)`, as in your first example. — RJ Adriaansen, Feb 01 '21 at 09:44
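
The relationship in the last comment, `x = mu + (z * sigma)`, can be sketched without a lookup table. This is only a minimal sketch: it assumes N(25, 4) means mean 25 and standard deviation 4 (as the question's code does), and it uses the standard library's `statistics.NormalDist` (Python 3.8+) to obtain the standard-normal z-scores instead of the linked table. The variable names are illustrative.

```python
from statistics import NormalDist

mu = 25      # mean, from the problem statement
sigma = 4    # assumed to be the standard deviation, as in the question's code

standard_normal = NormalDist()  # mean 0, standard deviation 1

percentiles = {}
for p in (0.10, 0.90, 0.80, 0.50):
    z = standard_normal.inv_cdf(p)    # z-score for cumulative probability p
    percentiles[p] = mu + z * sigma   # shift and scale back to the scale of X

for p, x in percentiles.items():
    print(f"{p:.0%} percentile: {x:.4f}")
```

Because the z-scores here are not rounded to three decimals, the results differ slightly from Attempt 1 (roughly 19.874 instead of 19.872 for the 10th percentile), which may matter if the hidden tests compare against exact values.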

score 12 · Answer 1 · answered Mar 27 '21 at 12:04

You can use scipy.stats and its built-in ppf (percent point function) method; see the documentation.

import numpy as np
import scipy.stats as sps
import matplotlib.pyplot as plt

mu = 25
sigma = 4

# define the normal distribution and PDF
dist = sps.norm(loc=mu, scale=sigma)
x = np.linspace(dist.ppf(.001), dist.ppf(.999))
y = dist.pdf(x)

# calculate PPFs
ppfs = {}
for ppf in [.1, .5, .8, .9]:
    p = dist.ppf(ppf)
    ppfs.update({ppf*100: p})

# plot results
fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(x, y, color='k')
for i, ppf in enumerate(ppfs):
    ax.axvline(ppfs[ppf], color=f'C{i}', label=f'{ppf:.0f}th: {ppfs[ppf]:.1f}')
ax.legend()
plt.show()

that gives

[Figure: the normal PDF with vertical lines marking the 10th, 50th, 80th, and 90th percentiles]

score 6 · Answer 2 · answered Aug 02 '21 at 13:07

Use the ppf method from scipy.stats.norm (normal distribution).

scipy.stats.norm.ppf(0.1, loc=25, scale=4)

This function is analogous to the qnorm function in R. The ppf method gives the value of the random variable at the given percentile.
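
One way to see what "the value of the random variable at the given percentile" means is to check that `ppf` undoes `cdf`. The following is a small sketch, assuming the same `loc=25, scale=4` as above (that is, 4 taken as the standard deviation); the variable names are illustrative.

```python
from scipy.stats import norm

mu, sigma = 25, 4  # sigma assumed to be the standard deviation

# Value of X below which 10% of the probability mass lies.
x10 = norm.ppf(0.10, loc=mu, scale=sigma)

# Feeding that value back into the CDF should recover 0.10.
p_back = norm.cdf(x10, loc=mu, scale=sigma)

print(x10, p_back)
```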

answered Aug 02 '21 at 13:07 by Ananthu

This is cool. To get all the percentiles at once, pass a list: `scipy.stats.norm.ppf([0.1, 0.9, 0.8, 0.5], loc=25, scale=4)` returns `[19.87379374, 30.12620626, 28.36648493, 25.]`. The 100th percentile gives `inf`; I'm not from a stats background, so I'm not sure why. – Prox Mar 14 '22 at 08:59
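
On the `inf` in the comment above: the normal distribution has unbounded support, so its CDF approaches 1 without reaching it at any finite value, and the only point with 100% of the probability below it is infinity. A short sketch of that behaviour, again assuming `loc=25, scale=4`:

```python
import numpy as np
from scipy.stats import norm

mu, sigma = 25, 4

upper = norm.ppf(1.0, loc=mu, scale=sigma)          # 100th percentile
lower = norm.ppf(0.0, loc=mu, scale=sigma)          # 0th percentile
near_top = norm.ppf(0.999999, loc=mu, scale=sigma)  # just below 100%: still finite

print(upper, lower, near_top)
```

By the same reasoning the 0th percentile comes out as negative infinity, while any probability strictly between 0 and 1 maps to a finite value.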

score -1 · Answer 3 · edited Jan 25 '22 at 07:49

a_9 = 19.88
b_9 = 30.12
c_9 = 28.36
d_9 = 25.00

X = np.random.normal(25,4,10000000)

answered Jan 25 '22 at 07:43 by Thirunavukarasu Veluswamy
edited Jan 25 '22 at 07:49 by Clemsang

Could you please elaborate? – Tejas Shetty Jan 31 '22 at 07:07
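
One possible reading of this answer, offered only as a sketch: with a very large sample, the sample percentiles get close to the theoretical ones, but they remain estimates with sampling error. The seed, the sample size of 1,000,000, and the variable names below are illustrative choices and are not taken from the answer.

```python
import numpy as np
from scipy.stats import norm

mu, sigma = 25, 4                 # sigma assumed to be the standard deviation
percents = [10, 90, 80, 50]

rng = np.random.default_rng(0)    # fixed, arbitrary seed so runs are repeatable
sample = rng.normal(mu, sigma, 1_000_000)

estimated = np.percentile(sample, percents)                       # sample percentiles
exact = norm.ppf(np.array(percents) / 100, loc=mu, scale=sigma)   # theoretical percentiles

for q, est, ex in zip(percents, estimated, exact):
    print(f"{q}th: sampled {est:.3f}, exact {ex:.3f}")
```

The sampled values should agree with the exact ones only to about two decimal places at this sample size, which is a plausible reason for simulated answers to fail tests that expect the values from `ppf`.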
