1

I've written the following model in R:

model1_b <- glm(bupacts ~ sex + couples + women_alone, data = risky, family = poisson(link="log"))

I would like to find the coefficient estimates for this model using optim. That is, I would like to get to the point where the coefficients produced by the above model match those produced using optim. Here is the code for the likelihood function:

log.like <- function(par) {
    xb <- par[1] + par[2] * risky$sex + par[3] * risky$couples + par[4] * risky$women_alone
   lambda <- exp(xb)
    ll <- sum(log(exp(-lambda) * (lambda ^ risky$bupacts) / factorial(risky$bupacts)))
    return(-ll)}
result <- optim(c(0, 0, 0, 0), log.like, hessian = TRUE, method = "BFGS")

I keep getting the error: initial value in 'vmmin' is not finite. Could someone help me out here? I'm not sure what's going on.

w5698
  • 159
  • 7
  • 1
    this is a fine question, but can we please have a [mcve]? Do you have an objection to using `dpois()` instead of your hand-rolled Poisson log-likelihood? – Ben Bolker Apr 01 '21 at 15:10
  • I do not have an objection to that. What would the function look like with `dpois()`? – w5698 Apr 01 '21 at 15:13
  • you can replace your `ll` line with `ll <- sum(dpois(risky$bupacts, lambda, log=TRUE))`. (It *might* fix your problem since you wouldn't be computing the likelihood before log-transforming ...) We'd still like a reproducible example to diagnose your starting-value problem ... – Ben Bolker Apr 01 '21 at 15:20
  • Thanks -- I appreciate your help. I will try this. To clarify, how is what I posted not a reproducible example? I looked at the link you shared, but it's not clear to me what else I should add. – w5698 Apr 01 '21 at 15:23
  • we don't have your data (`risky`). You can either post your data (or a subset of it that still generates the problem), or try to find a built-in or simulated data set that also replicates the problem. https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example is the canonical R-specific Q about reproducibility ... – Ben Bolker Apr 01 '21 at 15:27

1 Answers1

1

Use the negative log likelihood function shown below.

library(foreign)

# input
u <- file.path("http://www.stat.columbia.edu", 
  "~gelman/arm/examples/risky.behavior/risky_behaviors.dta")
risky <-  read.dta(u, convert.factors = TRUE)

# glm
model1_b <- glm(bupacts ~ sex + couples + women_alone, data = risky, 
  family = poisson(link = "log"))
coef(model1_b)
##   (Intercept)        sexman       couples   women_alone 
##  3.1743188102  0.0474967008  0.1448180037 -0.0007948884 

## glm using optim
neg.LL <- function(p) with(risky, {
   eta <- p[1] + p[2] * (sex == "man") + p[3] * couples +  p[4] * women_alone
  -sum(dpois(bupacts, lambda = exp(eta), log = TRUE))
})
fm <- lm(bupacts ~ sex + couples + women_alone, risky)
optim(coef(fm), neg.LL, method = "BFGS")

giving:

$par
  (Intercept)        sexman       couples   women_alone 
 3.1744020275  0.0474937915  0.1447499657 -0.0009630441 

$value
[1] 7083.872

$counts
function gradient 
     178       37 

$convergence
[1] 0

$message
NULL
G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
  • There was a problem with the coding of the sex variable. Have corrected. – G. Grothendieck Apr 01 '21 at 16:15
  • Note that this also works with starting value `rep(0,length(coef(fm)))`, which is closer to the OP's specific question ... – Ben Bolker Apr 01 '21 at 23:03
  • It seems the `lm` coefficients are not all that close to the `glm` coefficients and, in fact, with zero starting values it converges in half the iterations. Even fewer if we start from all ones. – G. Grothendieck Apr 02 '21 at 02:09