I'm conducting a Monte Carlo study of a linear model with heteroskedasticity and left censoring of the dependent variable at 0. The mean censoring rate across replications is 25.9%.
I get the error
Error in lm.fit(X.vlm, y = z.vlm, ...) : NA/NaN/Inf in 'x'
after trying to estimate a tobit model with VGAM:
vglm(y[i,]~X[1,i,]+X[2,i,]+X[3,i,]+X[4,i,],family=tobit(Lower=0))
My data are simulated from standard distributions, so the problem shouldn't come from odd variables.
I found two other questions that reported the same problem with real data: lm() NA/NaN/Inf error, lm() NA/NaN/Inf error. Neither seemed to have a satisfying answer. Besides, my data are easily reproducible, which should help identify the problem.
Here is the code:
library(VGAM)
set.seed(12345)
nobs=100
nsim=100
b=c(2,-2,-3,3)
g=c(1,0.2)
y=matrix(rep(0,nobs*nsim),ncol=nobs,nrow=nsim)
X=array(0,dim=c(4,nsim,nobs))
res=matrix(rep(0,nobs*nsim),ncol=nobs,nrow=nsim)
tobit=vector(mode="list",length=nsim)
for(i in 1:nsim){
  # generate covariates
  X[1,i,]=rlnorm(n=nobs)
  X[2,i,]=runif(n=nobs)<=.75
  X[3,i,]=rnorm(mean = 3,n=nobs)
  X[4,i,]=runif(n=nobs,min=0,max=10)
  # heteroskedastic error: standard deviation increases with X[4,i,]
  res[i,]=(g[1]+g[2]*X[4,i,])*rnorm(n=nobs)
  # generate censored dependent variable
  y[i,]=b[1]*X[1,i,]+b[2]*X[2,i,]+b[3]*X[3,i,]+b[4]*X[4,i,]+res[i,]
  y[i,]=sapply(y[i,],FUN=function(x){max(0,x)}) # apply censoring at 0
  tobit[[i]]<-vglm(y[i,]~X[1,i,]+X[2,i,]+X[3,i,]+X[4,i,],
                   family = tobit(Lower=0))
}
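For reference, the same design can also be written with one data frame per replication, which makes it easy to check the censoring rate separately from the model fit. This is only a sketch: simulate_rep, sims and censor_rate are names introduced here for illustration, and pmax() stands in for the sapply() censoring step above.

```r
# Hypothetical helper: simulate one replication of the design above.
simulate_rep <- function(nobs, b = c(2, -2, -3, 3), g = c(1, 0.2)) {
  x1 <- rlnorm(nobs)
  x2 <- as.numeric(runif(nobs) <= 0.75)
  x3 <- rnorm(nobs, mean = 3)
  x4 <- runif(nobs, min = 0, max = 10)
  res <- (g[1] + g[2] * x4) * rnorm(nobs)  # heteroskedastic error
  ystar <- b[1] * x1 + b[2] * x2 + b[3] * x3 + b[4] * x4 + res
  # left censoring at 0
  data.frame(y = pmax(0, ystar), x1, x2, x3, x4)
}

set.seed(12345)
sims <- lapply(seq_len(100), function(i) simulate_rep(nobs = 100))

# Share of censored observations in each replication
censor_rate <- sapply(sims, function(d) mean(d$y == 0))
summary(censor_rate)
```

Each element of sims could then be passed to vglm() through its data argument, with a formula such as y ~ x1 + x2 + x3 + x4.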
Here is the traceback
traceback()
5: lm.fit(X.vlm, y = z.vlm, ...)
4: vlm.wfit(xmat = X.vlm.save, z, Hlist = NULL, U = U, matrix.out =FALSE,
is.vlmX = TRUE, qr = qr.arg, xij = NULL)
3: vglm.fitter(x = x, y = y, w = w, offset = offset, Xm2 = Xm2,
Ym2 = Ym2, etastart = etastart, mustart = mustart, coefstart =coefstart,
family = family, control = control, constraints = constraints,
criterion = control$criterion, extra = extra, qr.arg = qr.arg,
Terms = mt, function.name = function.name, ...)
2: vglm(y[1, ] ~ X[1, 1, ] + X[2, i, ] + X[3, i, ] + X[4, i, ],
family = tobit(Lower = 0))
1: traceback(vglm(y[1, ] ~ X[1, 1, ] + X[2, i, ] + X[3, i, ] + X[4,
i, ], family = tobit(Lower = 0)))
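The traceback ends in base R's lm.fit(), which VGAM apparently calls on its own working matrices (X.vlm and z.vlm) rather than on my data directly. As a sanity check, the sketch below uses toy inputs and base R only to show that lm.fit() raises this kind of message when a non-finite value appears in its arguments. The names x_ok, x_bad, y_ok, y_bad, msg_x and msg_y are illustrative and are not objects from the simulation.

```r
# Toy design matrices and responses (not the simulated data)
x_ok  <- cbind(1, c(1, 2, 3))
x_bad <- cbind(1, c(1, NaN, 3))   # one non-finite entry in the design matrix
y_ok  <- c(1, 2, 4)
y_bad <- c(1, Inf, 4)             # one non-finite entry in the response

# Capture the error messages instead of stopping
msg_x <- tryCatch(lm.fit(x_bad, y_ok), error = function(e) conditionMessage(e))
msg_y <- tryCatch(lm.fit(x_ok, y_bad), error = function(e) conditionMessage(e))

print(msg_x)  # complains about 'x'
print(msg_y)  # complains about 'y'
```

Running all(is.finite(y)) and all(is.finite(X)) on the simulated arrays would be the matching check on the inputs. If both are TRUE, the non-finite values are presumably produced during the fitting iterations rather than by the simulation.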
*** Edit :
By removing one covariate (I tried with X[3,i,] and X[4,i,]) and setting the lower censoring limit at -0.001, as BondedDust suggests, it works fine, and I could even push the number of replications to 1000 without major problems.
By just setting the lower censoring limit at -0.001 and keeping all the covariates, I get two errors out of 100 iterations. It is worth noting that the error is now
Error in lm.fit(X.vlm, y = z.vlm, ...) : NA/NaN/Inf in 'y'
Besides, I also get these warnings:
In vglm.fitter(x = x, y = y, w = w, offset = offset, Xm2 = Xm2, ... :
iterations terminated because half-step sizes are very small
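To see which replications fail or only warn without stopping the loop, the fit can be wrapped so that errors and warnings are recorded rather than raised. The following is a minimal sketch: try_fit is a hypothetical helper written for this question, not a VGAM function, and it is exercised on toy base R calls here.

```r
# Hypothetical helper: run a fitting function, capturing any error and
# all warnings instead of interrupting the simulation loop.
try_fit <- function(fit_fun) {
  warns <- character(0)
  fit <- withCallingHandlers(
    tryCatch(fit_fun(), error = function(e) e),  # an error becomes the result
    warning = function(w) {
      warns <<- c(warns, conditionMessage(w))    # record the warning text
      invokeRestart("muffleWarning")             # and carry on with the fit
    }
  )
  list(fit = fit, failed = inherits(fit, "error"), warnings = warns)
}

# Toy checks with base R only (no VGAM needed)
ok   <- try_fit(function() lm.fit(cbind(1, 1:3), c(1, 2, 4)))
bad  <- try_fit(function() lm.fit(cbind(1, c(1, NaN, 3)), c(1, 2, 4)))
warn <- try_fit(function() { warning("half-step sizes are very small"); 1 })

c(ok_failed = ok$failed, bad_failed = bad$failed,
  n_warnings = length(warn$warnings))
```

In the simulation loop, the vglm() call would be passed as function() vglm(..., family = tobit(Lower = 0)), and the failed flags and warning messages could then be tabulated across replications.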