
Basically, I have plotted a normal curve by using the values of mean and standard deviation. The y-axis gives the probability density.

How do I find the probability at a certain value "x" on the x-axis? Is there any Python function for it or how do I code it?

    If you have a continuous distribution, the probability of any single value is by definition zero. The only question you can ask is what the probability is of x being above or below some value, or within some range. In that case, this should answer it: https://www.dummies.com/education/math/statistics/how-to-find-statistical-probabilities-in-a-normal-distribution/ Or am I missing something? – My Work May 10 '20 at 17:46

2 Answers


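I am not sure whether you mean the probability density function, which for a normal distribution with mean $\mu$ and standard deviation $\sigma$ is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

In Python you can use stats.norm.fit to estimate the mean and standard deviation from data, and stats.norm.pdf to evaluate the density at a given x. For example, we simulate some data and fit a normal distribution to it: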

from scipy import stats
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt

data = stats.norm.rvs(10,2,1000)

x = np.linspace(min(data),max(data),1000)
mu, std = stats.norm.fit(data)  # fit returns (loc, scale), i.e. the mean and standard deviation
p = stats.norm.pdf(x, mu, std)

Now that we have estimated the mean and standard deviation, we use the pdf to evaluate the density at, for example, x = 12.5:

xval = 12.5
p_at_x = stats.norm.pdf(xval,mu,std)
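
As a quick sanity check, the same kind of value can be computed directly from the closed-form normal density. This is only a sketch: `normal_pdf` is a hypothetical helper, and the parameters below are assumed to be the true values used to simulate the data (mean 10, standard deviation 2), not the fitted estimates.

```python
import math
from scipy import stats


def normal_pdf(x, mu, sigma):
    """Normal density evaluated directly from the closed-form formula."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Assumed illustrative parameters (the simulation values, not fitted ones).
mu_true, sigma_true = 10.0, 2.0
xval = 12.5

manual = normal_pdf(xval, mu_true, sigma_true)
library = stats.norm.pdf(xval, mu_true, sigma_true)

# Both numbers should agree up to floating-point rounding.
print(manual, library)
```

With fitted parameters the result will differ slightly from run to run, because the data are random.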

We can plot to see if it is what you want:

fig, ax = plt.subplots(1,1)
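# distplot is deprecated in recent seaborn releases; histplot with stat="density" draws a comparable normalized histogram
sns.histplot(data, bins=50, stat="density", kde=True, ax=ax)
ax.plot(x, p)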
ax.hlines(p_at_x,0,xval,linestyle ="dotted")
ax.vlines(xval,0,p_at_x,linestyle ="dotted")
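
Keep in mind that the value returned by pdf is a density, not a probability. If what you actually need is a probability, it has to be taken over a range, which can be done with the cdf. A minimal sketch, assuming illustrative parameters (mean 10, standard deviation 2) and arbitrarily chosen interval bounds:

```python
from scipy import stats

# Assumed illustrative parameters; substitute your own estimates.
mu, sigma = 10.0, 2.0

# Probability that X falls between a and b: difference of two cdf values.
a, b = 12.0, 13.0
p_interval = stats.norm.cdf(b, mu, sigma) - stats.norm.cdf(a, mu, sigma)

# Probability that X exceeds 12.5: the survival function, 1 - cdf.
p_above = stats.norm.sf(12.5, mu, sigma)

# Over a very narrow interval, probability is approximately density * width.
width = 0.01
p_narrow = (stats.norm.cdf(12.5 + width / 2, mu, sigma)
            - stats.norm.cdf(12.5 - width / 2, mu, sigma))
approx = stats.norm.pdf(12.5, mu, sigma) * width

print(p_interval, p_above, p_narrow, approx)
```

The last two numbers illustrate how the density relates to probability: it is the probability per unit length near x.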


StupidWolf

A scipy.stats distribution includes these 3 methods:

pdf(x) the value of the pdf at x. This is what you asked for.

cdf(x) the cumulative probability at x.

ppf(p) the inverse of the cdf(). The critical value that gives cumulative probability, p.

import scipy.stats as stats
import matplotlib.pyplot as plt
import numpy as np
# plot a normal distribution and use scipy.stats to obtain
# probabilities and critical values (percentiles).
# Using scipy.stats, this can be done for any distribution
# listed in the documentation: https://docs.scipy.org/doc/scipy/reference/stats.html.
# scipy is included in the standard Anaconda python distribution.

loc = 0        # the mean
scale = 1      # the standard deviation
# a scipy.stats normal distribution
# scipy.stats supports 50+ continuous distributions.
d = stats.norm(loc, scale)

# a scipy.stats distribution includes these 3 methods:
#   d.pdf(x)     # the value of the pdf at x. This is what you asked for.
#   d.cdf(x)     # the cumulative probability at x.
#   d.ppf(p)     # the inverse of the cdf(). The critical value that gives cumulative probability, p.

# d.pdf(x) gives the value of the density at x, which is what you asked for.
print(f'The value of the pdf at x = 0 (the 50th percentile, a.k.a. the median): {d.pdf(0)}')
# d.cdf(x) gives the cumulative probability at x (x is a critical value of the normal distribution).
print(f'The value of the cumulative distribution at x = 0 (the 50th percentile, a.k.a. the median): {d.cdf(d.ppf(.5))}')
# d.ppf(p) is the inverse of cdf. The critical value that gives cumulative probability, p.
print(f'The normal critical value that gives a cumulative probability = .5: {d.ppf(.5)}')

# plot the distribution over these percentiles.
quantile_range = (.01, .99)
# generate sample_size quantile values for the x-axis
# of the plot of the probability distribution function (pdf)
sample_size = 100
x = np.linspace(d.ppf(quantile_range[0]), d.ppf(quantile_range[1]), sample_size)

y = d.pdf(x)        # return an array of pdf values (densities) for x
# setup the plot area
plt.style.use('seaborn-v0_8-darkgrid')  # this style was named 'seaborn-darkgrid' before matplotlib 3.6
fig, ax = plt.subplots()
# If you move your mouse along the curve, you will
# see the value of the pdf in the lower left of the plot (mouse tips)
ax.plot(x, y, color='black', linewidth=1.5)

plt.show()
plt.close()
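
The same frozen distribution can also answer range and tail questions, which is usually what "probability at x" turns into for a continuous variable. A short sketch, assuming the standard normal used above; the cut-offs 1 and 1.96 are only illustrative choices:

```python
import scipy.stats as stats

# The same standard normal as above: mean 0, standard deviation 1.
d = stats.norm(loc=0, scale=1)

# Probability of falling within one standard deviation of the mean.
p_within_1sd = d.cdf(1) - d.cdf(-1)

# Upper-tail probability beyond 1.96, via the survival function (1 - cdf).
p_upper_tail = d.sf(1.96)

# Critical values that enclose the central 95% of the distribution.
lower, upper = d.ppf([0.025, 0.975])

print(p_within_1sd, p_upper_tail, lower, upper)
```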


tmck