4

First I am using "rmutil" package to make simulation of double poisson distributed data. The difference of poisson and double poisson is that, double poisson allows overdispersion and underdispersion where mean and variance not necessary equal.

This link shows the function of double poisson distribution: http://ugrad.stat.ubc.ca/R/library/rmutil/html/DoublePoisson.html

I have simulated a set of data with size of 500.

set.seed(10)
library("rmutil")

nn = 500 #size of data
gam = 0.7 #dispersion parameter
mu = 11

x <- rdoublepois(nn, mu, gam)

head(x)

[1] 11 9 10 13 6 8

 mean(x) #mean
 mean(x)/var(x) #dispersion

Below are the true value of parameters:

mean(x) #mean

[1] 10.986

mean(x)/var(x) #dispersion

[1] 0.695784

To obtain the parameter by MLE, I used nlminb function to maximize log likelihood function. The log likelihood function is formed by the density function of double distribution in "rmutil" package.

logl <- function(par) {
  mu.new <- par[1]
  gam.new <- par[2]

  -sum(ddoublepois(x, mu.new, gam.new, log=TRUE))
 }
nlminb(start = c(0.1,0.1), lower = 0, upper = Inf, logl)

It came out error:

Error in ddoublepois(x, mu.new, gam.new) : s must be positive

So i make another attempt, I type in the equation of double poisson density function.

logl2 <- function(par) {
  mu.new <- vector() #mean
  gam.new <- vector() #dispersion
  ddpoi <- vector()


for (i in 1:nn){    
    ddpoi[i] <- 0.5*log(gam.new[i])-gam.new[i]*mu.new[i]
    +x[i]*(log(x[i])-1)-log(factorial(x[i]))
    +(gam.new[i])*x[i]*(1+log(mu.new[i]/x[i]))
  }
  -sum(ddpoi)
 }
nlminb(start = c(0.1,0.1), lower = 0, upper = Inf, logl2)

Output:

nlminb(start = c(0.1,0.1), lower = 0, upper = Inf, logl2)

$par

[1] 0.1 0.1

$objective

[1] Inf

$convergence

[1] 0

$iterations

[1] 1

$evaluations

function gradient

2 4

$message [1] "X-convergence (3)"

Definitely, estimated parameters of 0.1 (same with initial value) shows that this code fails.

Anyone could show me how to do correct maximum likelihood estimation for double poisson distribution?

Thanks in advance.

Miyazaki
  • 99
  • 8

1 Answers1

4

Your problem is that nlminb is trying to evaluate the function on the boundary (i.e. s exactly equal to 0).

One way to figure this out is to modify logl to include debugging statements:

logl <- function(par,debug=FALSE) {
    mu.new <- par[1]
    gam.new <- par[2]
    if (debug) cat(mu.new,gam.new," ")
    r <- -sum(ddoublepois(x, m=mu.new, s=gam.new,log=TRUE))
    if (debug) cat(r,"\n")
    return(r)
}
nlminb(start = c(0.1,0.1), lower = 0, upper = Inf, logl, debug=TRUE)
## 0.1 0.1 3403.035 
## 0.1 0.1 3403.035 
## 0.1 0.1 3403.035 
## 1.022365 0 Error in ddoublepois(x, m = mu.new, s = gam.new, log = TRUE) : 
## s must be positive

Now try it with the boundary displaced slightly from zero:

nlminb(start = c(0.1,0.1), lower = 1e-5, upper = Inf, logl)

Gives reasonable answers

## $par
## [1] 10.9921451  0.7183259
## ...
Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
  • Dear @Ben Bolker, thank you for for your help. I still need to read more about the debug statement. I stuck at this problem for about 1 week and you have saved me! Hope that this answer could help more people. – Miyazaki May 07 '18 at 04:07
  • Hi @Ben Bolker I know this is an old question but could you possibly comment on why using the density function dDPO in the gamlss package gives a different result for the dispersion parameter compared to ddoublepois in the rmutil package? I can setup a new question if that would be easier? – Constantin Mar 24 '22 at 15:18
  • 1
    Never mind, I figured it out. Sigma in the gam.lss package = 1/dispersion parameter in the rmutil package. – Constantin Mar 24 '22 at 15:30