
I've been working with this data:

  year rango_edad Sexo zona_2016 conteo siniestros expuestos upc_millon valor_millon freq costo.medio ratio
1 2010    01. < 1    F   Alejada    180         87        75        121          111 0.48        1.27  0.92
2 2010    01. < 1    F  Ciudades 103453      76219     40228      60755        84981 0.74        1.11  1.40
3 2010    01. < 1    F  Especial   5129       3194      2078       3289         3013 0.62        0.94  0.92
4 2010    01. < 1    F    Normal  27393      18436     10735      15656        16692 0.67        0.91  1.07
5 2010    01. < 1    M   Alejada    185         98        73        116          110 0.53        1.12  0.94
6 2010    01. < 1    M  Ciudades 106915      80731     41719      62991       105135 0.76        1.30  1.67

and I'm trying to model the frequency with gamlss:

gamlss(freq ~ Sexo + zona_2016 + rango_edad, family = PO(mu.link = "log"), data = na.omit(subset(datos, is.na(freq) == FALSE)))
gamlss(freq ~ Sexo + zona_2016 + rango_edad, family = NBI(mu.link = "log"), data = na.omit(subset(datos, is.na(freq) == FALSE)))

but I received this error message:

Error in while (abs(G.dev.old - G.dev) > c.crit && iter < n.cyc) { :
  missing value where TRUE/FALSE needed

How can I solve that?

James
bubleskmy

2 Answers


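The response variable is not a count but a frequency with values between 0 and 1, so count distributions such as Poisson (PO) or negative binomial (NBI) are not appropriate for it. A suitable model for this kind of response (target) variable is the beta distribution. Please try family = BE.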
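A minimal sketch of what a beta fit could look like. It uses simulated data, not the asker's `datos`: the data frame `sim`, the column `sexo`, and the parameter values are all made up for illustration. Note that BE is defined on the open interval (0, 1), so this assumes the response never equals exactly 0 or 1.

```r
library(gamlss)

# Simulated stand-in data (hypothetical, not the asker's dataset):
# a proportion in (0, 1) whose mean depends on a two-level factor.
set.seed(1)
n    <- 200
sexo <- factor(sample(c("F", "M"), n, replace = TRUE))
mu   <- plogis(0.5 + 0.3 * (sexo == "M"))   # true mean on the logit scale
phi  <- 20                                   # assumed precision for the simulation
freq <- rbeta(n, mu * phi, (1 - mu) * phi)
sim  <- data.frame(freq = freq, sexo = sexo)

# Beta regression for the mean, with a logit link
fit_be <- gamlss(freq ~ sexo, family = BE(mu.link = "logit"), data = sim)
summary(fit_be)
```

With the real data, the formula would presumably become `freq ~ Sexo + zona_2016 + rango_edad`, provided every `freq` value lies strictly between 0 and 1.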

Valentin Ruano
  • You are right, it doesn't make sense to use Poisson if the response is a proportion between 0 and 1. That being said, there seems to be a bug in gamlss that doesn't allow for fractional response values even if they are greater than 1. – James Jun 22 '18 at 14:40
  • In addition, Negative Binomial has the same problem. – James Jun 26 '18 at 17:20

I got a similar error, and it appears to be caused by fractional response values. For example, in the code below case 1 is fine, but cases 2-4 fail:

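library(gamlss)

trt <- c("A", "A", "A", "B", "B", "B")

# Case 1: integer response, fits without error
resp1 <- rep(1, 6)
ftd1 <- gamlss(resp1 ~ trt, family = PO(mu.link = "log"))

# Case 2: every value fractional and below 1, fails
resp2 <- rep(0.0001, 6)
ftd2 <- gamlss(resp2 ~ trt, family = PO(mu.link = "log"))

# Case 3: a single fractional value below 1, fails
resp3 <- resp1
resp3[6] <- 0.0001
ftd3 <- gamlss(resp3 ~ trt, family = PO(mu.link = "log"))

# Case 4: a single fractional value above 1, fails
resp4 <- resp1
resp4[6] <- 1.75
ftd4 <- gamlss(resp4 ~ trt, family = PO(mu.link = "log"))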

To answer your question directly: use glm() or glm2() until the gamlss developers fix this (I sent them a link to this post). However, as the other answer pointed out, if your response is a proportion between 0 and 1, it doesn't make sense to fit a Poisson model in the first place.
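As a rough illustration of the glm() route, here is case 4 from above refitted with base R. The data frame `toy` is just a container I introduce for the example. I use quasipoisson() here as my own choice: poisson() should also fit, but it emits non-integer warnings for a fractional response.

```r
# Case 4 from the example above: one fractional value greater than 1
toy <- data.frame(
  resp = c(1, 1, 1, 1, 1, 1.75),
  trt  = factor(c("A", "A", "A", "B", "B", "B"))
)

# Log-link fit; quasipoisson accepts non-integer responses without warnings
fit_glm <- glm(resp ~ trt, family = quasipoisson(link = "log"), data = toy)

# Fitted mean per treatment group, back on the response scale
predict(fit_glm, newdata = data.frame(trt = c("A", "B")), type = "response")
```

Because the model has one parameter per treatment group, the fitted means simply reproduce the group averages, which makes the result easy to sanity-check by hand.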

James