9

Suppose I have the variable x that was generated using the following approach:

x <- rgamma(100,2,11) + rnorm(100,0,.01) #gamma distr + some gaussian noise

    head(x,20)
 [1] 0.35135058 0.12784251 0.23770365 0.13095612 0.18796901 0.18251968
 [7] 0.20506117 0.25298286 0.11888596 0.07953969 0.09763770 0.28698417
[13] 0.07647302 0.17489578 0.02594517 0.14016041 0.04102864 0.13677059
[19] 0.18963015 0.23626828

How could I fit a gamma distribution to it?

Severin Pappadeux
  • 18,636
  • 3
  • 38
  • 64
user1172468
  • 5,306
  • 6
  • 35
  • 62
  • 2
    See function `MASS::fitdistr`. – Rui Barradas Aug 06 '17 at 20:22
  • 3
    Sorry, I've just seen your data more closely, and its graph. Why would you want to fit a gamma to it? `all.equal(seq(0, 1, by = 0.01), x)` returns `TRUE`. – Rui Barradas Aug 06 '17 at 20:36
  • @RuiBarradas -- thanks -- let me update the question -- I generated the data by adding a small amount of gaussian noise to gamma deviates -- let me post that in the notes – user1172468 Aug 06 '17 at 21:07
  • @RuiBarradas -- updated teh question thanks. – user1172468 Aug 06 '17 at 21:20
  • there is an example of doing MLE for a Gaussian-contaminated Gamma sample of this sort in [my book](https://ms.mcmaster.ca/~bolker/emdbook/book.pdf) (page 433), although I doubt it'll work well for this data set -- the standard dev of the Gaussian component is prob. too small to distinguish it. – Ben Bolker Aug 07 '17 at 18:23

3 Answers3

14

A good alternative is the fitdistrplus package by ML Delignette-Muller et al. For instance, generating data using your approach:

set.seed(2017)
x <- rgamma(100,2,11) + rnorm(100,0,.01)
library(fitdistrplus)
fit.gamma <- fitdist(x, distr = "gamma", method = "mle")
summary(fit.gamma)

Fitting of the distribution ' gamma ' by maximum likelihood 
Parameters : 
       estimate Std. Error
shape  2.185415  0.2885935
rate  12.850432  1.9066390
Loglikelihood:  91.41958   AIC:  -178.8392   BIC:  -173.6288 
Correlation matrix:
          shape      rate
shape 1.0000000 0.8900242
rate  0.8900242 1.0000000


plot(fit.gamma)

enter image description here

Edgar Santos
  • 3,426
  • 2
  • 17
  • 29
  • 1
    There might be a problem with this solution. There are negative values of `x`. `sum(x < 0)`returns 5. `MASS::fitdistr` when applied to this vector gives an error: `Error in fitdistr(x, "gamma") : gamma values must be >= 0`. So maybe package `fitdistrplus` is doing all the checks it should. – Rui Barradas Aug 06 '17 at 21:37
  • @RuiBarradas, thanks. good point. Edited above. The function does not accept negative values. – Edgar Santos Aug 06 '17 at 22:13
4

You could try to quickly fit Gamma distribution. Being two-parameters distribution one could recover them by finding sample mean and variance. Here you could have some samples to be negative as soon as mean is positive.

set.seed(31234)
x <- rgamma(100, 2.0, 11.0) + rnorm(100, 0, .01) #gamma distr + some gaussian noise
#print(x)

m <- mean(x)
v <- var(x)

print(m)
print(v)

scale <- v/m
shape <- m*m/v

print(shape)
print(1.0/scale)

For me it prints

> print(shape)
[1] 2.066785
> print(1.0/scale)
[1] 11.57765
> 
Severin Pappadeux
  • 18,636
  • 3
  • 38
  • 64
  • 4
    true (this approach is called the "method of moments") but I would want to be **very** careful papering over cracks (presence of negative values in what is purportedly a Gamma-distributed sample) in this way ... – Ben Bolker Aug 07 '17 at 18:13
1

You could also try to fit Gamma distribution with the fast and efficient Le Cam one-step estimation procedure using the onestep command in the OneStep package.

library(OneStep)

set.seed(2017)
x <- rgamma(100,2,11) + rnorm(100,0,.01)
onestep(x,"gamma")

Parameters:
       estimate
shape  2.183281
rate  12.837968