Fitting probability distribution to data

Question

I'm trying to fit data to probability distribution (gamma function in my case).

With the method of moments I achieved some success:

mean, var = data.mean(), data.var()
α, β = mean ** 2 / var, var / mean

x = np.linspace(0, 100)

plt.plot(x, gamma.pdf(x, α, 0, β))
# Pandas is in use
data.plot(kind='hist', xlim=(0, 100), bins=500, normed=True, color='lightblue')
data.dropna().plot(kind='kde', xlim=(0, 100), style='r--')

However this is not satisfying so I decided to use scipy's method fit:

args = gamma.fit(data)

x = np.linspace(0, 100)

plt.plot(x, gamma.pdf(x, *args))
data.plot(kind='hist', xlim=(0, 100), bins=500, normed=True, color='lightblue')
data.dropna().plot(kind='kde', xlim=(0, 100), style='r--')

But the only result I achieved was this distribution:

Could someone describe me what I'm doing wrong? I thought that gamma.fit should do at least as good as method of moments.

UPD:

gamma.fit returns this: (0.00077655597754514266, -6.0499462017751218e-25, 3.6337557495676194)

data.describe() shows that data is ok:

count    5546.000000
mean       45.601515
std        28.563211
min         0.000000
25%        35.000000
50%        42.000000
75%        52.000000
max      1488.000000
dtype: float64

Trying to filtering data with data[[data > 0]] force location to 0

args = gamma.fit(list(data[[data > 0]]), floc=0)
args

(7.897834117836718, 0, 5.7749678991293987)

gives better results. Seems like it's working now. Thanks!

Usually you fit distributions to your data, not the other way around ;-) — ali_m, Dec 05 '15 at 00:22
You almost certainly want to use `floc=0` when you call `gamma.fit`: http://stackoverflow.com/questions/18190426/gamma-distribution-fit-error/18192200#18192200 — Warren Weckesser, Dec 05 '15 at 00:46
See also: http://stackoverflow.com/questions/18703262/using-python-scipy-to-fit-gamma-distribution-to-data/18705847#18705847 — Warren Weckesser, Dec 05 '15 at 00:47
Also, could you show the output of `data.describe()`? Does it contain missing or infinite values? — ali_m, Dec 05 '15 at 00:52

Fitting probability distribution to data

0 Answers0