
My dataset has 15 observations, and I want to test whether they could have come from an exponential distribution with rate = 0.54. The variable x is tabulated as follows:

table(x)
x
0  1  2  4  5  7  8 10 
2  1  4  2  2  2  1  1 
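
For reproducibility, a vector with the same counts can be rebuilt from the table. This is only a minimal sketch: the order of the observations is an assumption, since only the tabulated counts are shown.

```r
# Rebuild a vector with the tabulated counts (ordering is assumed, not known)
x <- rep(c(0, 1, 2, 4, 5, 7, 8, 10), times = c(2, 1, 4, 2, 2, 2, 1, 1))
length(x)  # 15 observations
table(x)   # reproduces the table above
```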

Any idea how to implement this in R?

Sandipan Dey
brock
  • This is not a dup @akrun; at least the link you have given does not use a chi-square test to check whether the data are drawn from a specific distribution. – Sandipan Dey Feb 10 '17 at 07:52
  • @akrun Sorry, but I don't think it's a duplicate question! Please read the question carefully. – brock Feb 10 '17 at 07:52
  • @SandipanDey Somebody else forwarded the dupe link and tagged it. That is all. – akrun Feb 10 '17 at 07:53

2 Answers


We can try something like this:

observed <- c(2,  1,  4,  2,  2,  2,  1,  1)
# exponential density at each observed value, used as relative weights
prob.exp <- dexp(c(0,  1,  2,  4,  5,  7,  8, 10), rate=0.54)
# rescale.p = TRUE normalizes the weights so that they sum to 1
chisq.test(observed, p=prob.exp, rescale.p = TRUE)
#X-squared = 73.523, df = 7, p-value = 2.86e-13

We can also compute the statistic directly from its definition:

observed <- c(2,  1,  4,  2,  2,  2,  1,  1)
prob.exp <- dexp(c(0,  1,  2,  4,  5,  7,  8, 10), rate=0.54)
prob.exp <- prob.exp / sum(prob.exp) # normalize so the weights sum to 1
expected <- sum(observed)*prob.exp   # expected frequency of each value
chisq.stat <- sum((observed-expected)^2/expected)
chisq.stat
# [1] 73.52297
1-pchisq(chisq.stat, df=length(observed)-1)
# [1] 2.859935e-13

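Both approaches give exactly the same result, as expected. The null hypothesis of the goodness-of-fit test is rejected, so the data are unlikely to have come from this distribution. (With expected counts this small, chisq.test warns that the chi-squared approximation may be incorrect.)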
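
As a hedged variation on the same idea, the observations could instead be grouped into intervals, with each interval's probability taken from the exponential CDF (pexp) rather than from the density at single points. The bin edges below are hypothetical and chosen only for illustration, and x is rebuilt from the table in the question; this is a sketch under those assumptions, not a replacement for the code above.

```r
# Rebuild the 15 observations from the table in the question
x <- rep(c(0, 1, 2, 4, 5, 7, 8, 10), times = c(2, 1, 4, 2, 2, 2, 1, 1))
rate <- 0.54

# Hypothetical bin edges (an assumption, for illustration only)
breaks <- c(0, 1, 2, 4, Inf)

# Observed counts per bin; right = FALSE gives intervals of the form [a, b)
observed <- as.vector(table(cut(x, breaks = breaks, right = FALSE)))

# Bin probabilities under Exp(rate): differences of the CDF at the edges
prob <- diff(pexp(breaks, rate = rate))

# Goodness-of-fit test; with only 15 observations the expected counts are
# small, so chisq.test() will warn about the chi-squared approximation
res <- chisq.test(observed, p = prob)
res
```

With so few observations, passing simulate.p.value = TRUE to chisq.test is one way to avoid relying on the chi-squared approximation.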

akrun
Sandipan Dey

You can test for a log link (i.e. an exponential relationship on the measured scale) between the numeric "names" of that table and its observed counts, using an offset of log(rate). If the model fitted with the offset has an intercept significantly different from 0, then the specific hypothesis is rejected (and here it is not). Note that the call below passes 0.54 itself as the offset rather than log(0.54):

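# assumed inputs (not shown in the original answer): the table's values and counts
nm   <- c(0, 1, 2, 4, 5, 7, 8, 10)
vals <- c(2, 1, 4, 2, 2, 2, 1, 1)
summary(glm(vals ~ nm + offset(rep(0.54, 8)), family = poisson))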

Call:
glm(formula = vals ~ nm + offset(rep(0.54, 8)), family = poisson)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-0.9762  -0.3363  -0.1026   0.1976   1.1088  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.36468    0.40787   0.894    0.371
nm          -0.06457    0.08027  -0.804    0.421

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 3.3224  on 7  degrees of freedom
Residual deviance: 2.6593  on 6  degrees of freedom
AIC: 26.38

Number of Fisher Scoring iterations: 4
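
To read the intercept test off programmatically rather than from the printed summary, the coefficient matrix can be extracted with coef(summary(...)). This is a minimal sketch: nm and vals are assumed to be the table's values and counts from the question, since the answer does not show how they were defined.

```r
# Assumed inputs (not shown in the answer): table values and their counts
nm   <- c(0, 1, 2, 4, 5, 7, 8, 10)
vals <- c(2, 1, 4, 2, 2, 2, 1, 1)

# Same model as above, with the constant 0.54 passed as the offset
fit <- glm(vals ~ nm + offset(rep(0.54, 8)), family = poisson)

# Coefficient matrix with columns Estimate, Std. Error, z value, Pr(>|z|)
ctab <- coef(summary(fit))

# p-value of the Wald z-test for the intercept
p_intercept <- ctab["(Intercept)", "Pr(>|z|)"]
p_intercept
```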
IRTFM