I've built a rudimentary function to extract AIC and BIC values from three models I'm interested in, for several variables. However, while it's running, my computer often stops and says that it can't allocate 200MB to a vector. (I'm using a large dataset with more than 500K cases, and yes, I've increased the memory limit to the maximum, 4000.)
I have managed to run it if I select a couple of variables at a time. I'd like to run the function in one go, and also to improve the code so that I don't have to delete everything else before running it and, if possible, don't have to wait 30 minutes. I'm likely to use amended AIC and BIC formulas and add other things, so I'd rather keep the AIC and BIC vectorisation as it is and not switch to other logistic regression functions. I've played around with it and added things like rm(model1), but that probably makes very little difference. Would you be able to suggest code that solves the memory allocation problem and possibly speeds up the function?
Many thanks
The function:
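myF <- function(mydata, TotScore, group) {
  # Preallocate one result slot per column of mydata
  BIC2 <- BIC1 <- BIC0 <- AIC2 <- AIC1 <- AIC0 <- rep(NA, ncol(mydata))
  for (i in seq_len(ncol(mydata))) {
    # M0: total score only
    M0 <- glm(mydata[, i] ~ TotScore, family = binomial, data = mydata,
              x = FALSE, y = FALSE, model = FALSE)
    AIC0[i] <- extractAIC(M0)[2]
    BIC0[i] <- extractAIC(M0, k = log(length(M0$fitted.values)))[2]
    rm(M0)
    # M1: total score plus group main effect
    M1 <- glm(mydata[, i] ~ TotScore + group, family = binomial, data = mydata,
              x = FALSE, y = FALSE, model = FALSE)
    AIC1[i] <- extractAIC(M1)[2]
    BIC1[i] <- extractAIC(M1, k = log(length(M1$fitted.values)))[2]
    rm(M1)
    # M2: main effects plus the TotScore-by-group interaction
    M2 <- glm(mydata[, i] ~ TotScore + group + TotScore * group, family = binomial,
              data = mydata, x = FALSE, y = FALSE, model = FALSE)
    AIC2[i] <- extractAIC(M2)[2]
    BIC2[i] <- extractAIC(M2, k = log(length(M2$fitted.values)))[2]
    rm(M2)
  }
  # One row per variable, one column per criterion/model combination
  Results <- cbind(AIC0, AIC1, AIC2, BIC0, BIC1, BIC2)
  rownames(Results) <- names(mydata)
  return(Results)
}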
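For reference, the BIC columns above are obtained from extractAIC() by setting the per-parameter penalty k to log(n) instead of the default 2. The following is only a minimal sketch on a small simulated dataset (the names n, x, y, fit, aic_manual and bic_manual are purely illustrative and not part of my real data); it checks that this agrees with the built-in AIC() and BIC() for a binomial glm:

```r
# Small simulated dataset; all names here are illustrative assumptions
set.seed(1)
n <- 1000
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(0.5 * x))

fit <- glm(y ~ x, family = binomial)

# extractAIC() returns c(edf, criterion); k is the penalty per parameter
aic_manual <- extractAIC(fit)[2]              # default k = 2 gives AIC
bic_manual <- extractAIC(fit, k = log(n))[2]  # k = log(n) gives BIC

# The same quantities written out from the log-likelihood
ll  <- as.numeric(logLik(fit))
edf <- extractAIC(fit)[1]
c(AIC = -2 * ll + 2 * edf, BIC = -2 * ll + log(n) * edf)

# Compare with the built-in helpers
all.equal(aic_manual, AIC(fit))
all.equal(bic_manual, BIC(fit))
```

Any amended formula I end up using would presumably change only the penalty term, which is why I'd like to keep this part of the function as it is.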
P.S. The function can be tried with:
## Random dataset example
v1 <- sample(0:1, 500000, replace = TRUE, prob = c(.80, .20))
v2 <- sample(0:1, 500000, replace = TRUE, prob = c(.85, .15))
v3 <- sample(0:1, 500000, replace = TRUE, prob = c(.95, .05))
mydata <- as.data.frame(cbind(v1, v2, v3))
TotScore <- rowSums(mydata)
group <- rep(1:5, 100000)
myF(mydata, TotScore, group)