
This is the script I am working on:

library(poLCA)

f <- cbind(bq70, bq72_1, bq72_2, bq72_3, bq72_4, bq72_5, 
           bq72_6, bq72_7, bq73a_1, bq73a_2, bq73a_3, bq73a_4) ~ 
     zq88 + zq89 + dm_zq101_2 + dm_zq101_3 + dm_zq101_4 + 
     dm_zq101_5 + dm_zq101_6 + dm_zq101_7 + dm_zq101_8 + dm_zq101_9

for(i in 2:14){
  max_II <- -1000000
  min_bic <- 100000

  for(j in 1:1024){
    res <- poLCA(f, BESDATA, nclass=i, maxiter=1000, 
                 tol=1e-5, na.rm=FALSE, probs.start=NULL, 
                 nrep=1, verbose=TRUE, calc.se=TRUE)
    if(res$bic < min_bic){
      min_bic <- res$bic
      LCA_best_model<-res
    }
  }
}

I would like to run a latent class analysis that also includes a regression on covariates. However, the above code takes a very long time to complete on my PC (Intel Core i5-4690K, 16 GB RAM).

Is it typical for poLCA to take this long?

Also, is there a line of code I can use to stop the loop for each number of classes once the global maximum likelihood has been reached?

N = around 2000.

I use RStudio, by the way, in case it matters!

Peter O.
Steven
  • These are mostly programming questions better suited to stackoverflow. However, here are a couple of observations about the statistical aspects. 1. The lowest BIC is not usually the 'maximum likelihood [value of the data under nclass]'. 2. You can't know if or when the global ML (or the lowest BIC) is found, only that the current one is better than previous ones. – conjugateprior Jun 11 '15 at 13:30
  • And a couple about the code: 1. Skimming the package documentation, you can replace the entire `j` loop by setting `nrep=1024`. The best model is returned as `res`. After it's fitted, do your `min_bic` test. 2. You don't use `max_II` anywhere. – conjugateprior Jun 11 '15 at 13:39

1 Answer


Yes, the function can run slowly if you have a large dataset, or a complex model like the one you’ve specified here.

To speed things up, I’d suggest eliminating the j loop and instead setting nrep=30 (say). That automates the search for the global maximum likelihood at each potential number of latent classes (2 to 14), with poLCA keeping the best of the 30 runs. My guess is you’ll find that you don’t need to run each model specification 1000+ times to find the global maximum.

Then, compare the BICs from the fitted models for each number of LCs to help choose the specification with the best number of classes. Don’t only rely on the BICs, though. The class-conditional response probabilities should also be looked at to see which model specification is the most substantively useful or meaningful for your application.
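As a rough sketch of that workflow (not a definitive implementation), something like the following fits one model per number of classes, lets nrep handle the random restarts, and keeps every fit so the BICs can be compared afterwards. The helper name fit_by_nclass is made up for illustration, and the example runs on the small `values` dataset that ships with poLCA rather than on BESDATA, so the formula, class range, and nrep value are assumptions to replace with your own.

```r
library(poLCA)

# Hypothetical helper: fit one model per class count and keep all of them.
# nrep makes poLCA restart from random starting values and return the run
# with the highest log-likelihood, which replaces the manual j loop.
fit_by_nclass <- function(formula, data, classes, nrep = 30) {
  fits <- lapply(classes, function(k) {
    poLCA(formula, data, nclass = k, maxiter = 1000,
          nrep = nrep, verbose = FALSE)
  })
  names(fits) <- paste0("nclass_", classes)
  fits
}

# Illustration on poLCA's bundled `values` data (assumption: swap in your
# own formula f, BESDATA, and classes = 2:14).
data(values)
set.seed(1)
fits <- fit_by_nclass(cbind(A, B, C, D) ~ 1, values, classes = 2:3, nrep = 5)

# One BIC per fitted model; the smallest value marks the BIC-preferred fit.
bics <- sapply(fits, function(m) m$bic)
best <- fits[[which.min(bics)]]
```

Keeping the whole list, rather than only the lowest-BIC fit, also makes it easy to inspect the class-conditional response probabilities (for example fits$nclass_3$probs) for the neighbouring specifications.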

Drew
  • Thanks to both of you for the help. I have edited the script now and it is much faster. However, the code that was used to determine the lowest BIC and select that class in the output is now not working. It just chooses the last class tested (e.g., 14). Anyway, thanks again! – Steven Jun 14 '15 at 18:55
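
One plausible cause, assuming the edited script still resets min_bic inside the loop over i as the original does: the comparison then starts from scratch for every number of classes, so LCA_best_model is overwritten each time and ends up holding the last model fitted. A minimal sketch of the fix is to initialise the tracker once, before the loop. The list of fits below is a made-up stand-in with placeholder BIC values, used only so the selection logic can be run without poLCA or the data.

```r
# Hypothetical stand-ins for fitted models; only the $bic field matters here.
# The BIC values are placeholders, not results from any real model.
fits <- list(
  list(nclass = 2, bic = 310),
  list(nclass = 3, bic = 295),
  list(nclass = 4, bic = 302)
)

# Initialise once, outside the loop, so the minimum carries across classes.
min_bic <- Inf
LCA_best_model <- NULL

for (res in fits) {
  if (res$bic < min_bic) {
    min_bic <- res$bic
    LCA_best_model <- res
  }
}
```

In the real script, res would be the object returned by poLCA for each value of i, and the two initialisation lines would sit above the for(i in 2:14) loop.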