What is the parameter a in scipy.stats.gamma library

Question

I am trying to fit a gamma CDF using scipy.stats.gamma, but I do not know exactly what the a parameter is or how the location and scale parameters are calculated. Different sources give different ways to calculate them, and it's very frustrating. I am using the code below, which does not give the correct CDF. Thanks in advance.

from scipy.stats import gamma 
loc = (np.mean(jan))**2/np.var(jan)
scale = np.var(jan)/np.mean(jan)
Jancdf  = gamma.cdf(jan,a,loc = loc, scale = scale)

Did https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gamma.html not help? — mkrieger1, Nov 08 '20 at 16:44

StupidWolf · Accepted Answer · 2020-11-08T19:15:46.440

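a is the shape parameter. The formulas you tried, mean²/variance and variance/mean, are the method-of-moments estimates of the shape (a) and the scale, not of loc, and they hold only when loc = 0. To see what loc does, start with two simulated examples with shape (a) = 10 and scale = 5. The second, d1plus50, differs from the first only by loc = 50, and the histograms show the resulting shift: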

from scipy.stats import gamma 
import matplotlib.pyplot as plt

d1 = gamma.rvs(a = 10, scale=5,size=1000,random_state=99)
plt.hist(d1,bins=50,label='loc=0,shape=10,scale=5',density=True)
d1plus50 = gamma.rvs(a = 10, loc= 50,scale=5,size=1000,random_state=99)
plt.hist(d1plus50,bins=50,label='loc=50,shape=10,scale=5',density=True)
plt.legend(loc='upper right')
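To make the "only when loc = 0" point concrete, here is a minimal sketch of the moment formulas from the question applied to simulated data. The variable names (sample, shifted, mom_shape, mom_scale, bad_shape) are mine, chosen for illustration. With loc = 0 the mean is a * scale and the variance is a * scale**2, which is where the two formulas come from.

```python
import numpy as np
from scipy.stats import gamma

# Simulated data with loc = 0, so the moment formulas apply directly.
sample = gamma.rvs(a=10, scale=5, size=1000, random_state=99)

mean, var = np.mean(sample), np.var(sample)
mom_shape = mean**2 / var   # estimates a (the shape), not loc
mom_scale = var / mean      # estimates scale

# Shifting the data by 50 raises the mean but leaves the variance
# unchanged, so the same formulas no longer recover the shape.
shifted = sample + 50
bad_shape = np.mean(shifted)**2 / np.var(shifted)

print(mom_shape, mom_scale, bad_shape)
```

The first two numbers should land near 10 and 5, while the third is far from 10, which is why these formulas cannot be used as-is once the data has a nonzero loc.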

So you have three parameters to estimate from the data. One way is to use gamma.fit; here we apply it to the simulated distribution with loc=0:

import numpy as np

# x values at which the fitted density is evaluated for plotting
xlin = np.linspace(0,160,50)

fit_shape, fit_loc, fit_scale=gamma.fit(d1)
print([fit_shape, fit_loc, fit_scale])

[11.135335235456457, -1.9431969603988053, 4.693776771991816]

plt.hist(d1,bins=50,label='loc=0,shape=10,scale=5',density=True)
plt.plot(xlin,gamma.pdf(xlin,a=fit_shape,loc = fit_loc, scale = fit_scale))
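Note that the fitted loc above came out slightly negative even though the data was simulated with loc = 0. If you already know your data starts at zero, one option is to hold loc fixed with the floc keyword of fit, so that only shape and scale are estimated. A small sketch, assuming that constraint is appropriate for your data (the names sample, fixed_shape, fixed_loc and fixed_scale are just for this example):

```python
from scipy.stats import gamma

# Same simulated data as d1 above (loc = 0, shape = 10, scale = 5).
sample = gamma.rvs(a=10, scale=5, size=1000, random_state=99)

# floc=0 pins loc at zero; only shape and scale are fitted.
fixed_shape, fixed_loc, fixed_scale = gamma.fit(sample, floc=0)

print([fixed_shape, fixed_loc, fixed_scale])
```

Whether to fix loc is a modelling decision: it makes sense for quantities that cannot go below zero, but not for data that may have a genuine offset like d1plus50.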

If we do the same for the distribution simulated with loc=50, you can see that loc is recovered approximately, as are shape and scale:

fit_shape, fit_loc, fit_scale=gamma.fit(d1plus50)
print([fit_shape, fit_loc, fit_scale])

[11.135287555530564, 48.05688649976989, 4.693789434095116]

plt.hist(d1plus50,bins=50,label='loc=50,shape=10,scale=5',density=True)
plt.plot(xlin,gamma.pdf(xlin,a=fit_shape,loc = fit_loc, scale = fit_scale))
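Since the question was about the CDF, here is a rough sketch of evaluating the fitted CDF at the data points themselves and comparing it with the empirical CDF. It uses the simulated loc=50 sample in place of your jan array, and the names data, fitted_cdf, empirical_cdf and max_gap are illustrative only.

```python
import numpy as np
from scipy.stats import gamma

# Stand-in for the real data: simulated with loc = 50.
data = gamma.rvs(a=10, loc=50, scale=5, size=1000, random_state=99)
fit_shape, fit_loc, fit_scale = gamma.fit(data)

# Fitted CDF evaluated at each observation, in sorted order.
data_sorted = np.sort(data)
fitted_cdf = gamma.cdf(data_sorted, a=fit_shape, loc=fit_loc, scale=fit_scale)

# Empirical CDF: fraction of observations <= each sorted value.
empirical_cdf = np.arange(1, data.size + 1) / data.size

# Largest vertical gap between the two curves.
max_gap = np.max(np.abs(fitted_cdf - empirical_cdf))
print(max_gap)
```

A small gap suggests the three fitted parameters describe the data reasonably well. Plotting fitted_cdf and empirical_cdf against data_sorted gives the same check visually.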

Thanks for quick response. It has almost resolved my issue. There is one small doubt. I thought that input array to fit gamma distribution (gamma.pdf(xlin,a=fit_shape,loc = fit_loc, scale = fit_scale)) should be d1 (data to be fitted) instead of xlin but that is not correct. Can you tell me little more to clear everything? — Vishal singh rajpoot, Nov 09 '20 at 06:49
in your case, if you need the cdf, you would do ```gamma.cdf(jan,a=fit_shape,loc = fit_loc, scale = fit_scale)``` . In the example above, i have obtained an estimate and I am just plotting to show the fit — StupidWolf, Nov 09 '20 at 09:00
Thank you very much. It took some time for me to get it. I regret that. This was a big issue for me. Thanks again. — Vishal singh rajpoot, Nov 10 '20 at 03:53
you're welcome :) yeah the scale, shape and loc thing for gamma is a bit confusing at times — StupidWolf, Nov 10 '20 at 20:38
