I'm a newbie to R with a question about poLCA. When you run a latent class analysis in poLCA it generates a value for each respondent giving their posterior probability of 'belonging' to each latent class. These are stored as a matrix in the element 'posterior'.

I would like to create a dataframe which contains each respondent's unique ID number (which is stored as a variable in the dataframe used for poLCA) and their matched posterior probability from the 'posterior' matrix. I would then like to write this dataframe to a csv file for use in another program.

I know this is possible, but I just can't seem to get it right (blame my incompetence with R). Any help would be very warmly appreciated.

Best wishes. Robert de Vries

EDIT - EXAMPLE

#Loading the poLCA package
library(poLCA)

#The ID variable
serial <- sample(1:1000, 100, replace = FALSE)

#The variables used in the latent class model
V1 <- sample(1:2, 100, replace = TRUE)
V2 <- sample(1:2, 100, replace = TRUE)
V3 <- sample(1:2, 100, replace = TRUE)
V4 <- sample(1:2, 100, replace = TRUE)

#the data given to the lca
lcadata <- data.frame(serial,V1,V2,V3,V4)

#the lca formula
f <- cbind(V1,V2,V3,V4) ~1

#A 2 class LCA model in poLCA
lca2 <- poLCA(f,lcadata,nclass=2,maxiter=5000,nrep=10)
rdevries
  • Can you provide a snippet of your data or a toy data set on which we could run `poLCA()` to help answer your question? This should be pretty straightforward, but it's hard to get it exactly right without a reproducible example. – ulfelder Aug 03 '15 at 16:22
  • Welcome to SO and thank you for posting a question. Please include the code that you've tried and what error messages you've got. An [MVCE](http://stackoverflow.com/help/mcve) would be a great way to do this and there is a question [that demonstrates how](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – Richard Erickson Aug 03 '15 at 16:22
  • I have the feeling that the problem is just a matter of subsetting an object that is the result of a function call. Could you provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) and help us to help you? – SabDeM Aug 03 '15 at 16:22
  • Thanks for the prompt replies. I've added an example of the code I'm using. This code will run a 2 class LCA model on the 100 hypothetical observations. It will store the posterior probability that each case 'belongs' in each of the two classes in the matrix 'lca2$posterior'. It's these that I want to match with the 'serial' ID numbers in a new dataframe. – rdevries Aug 03 '15 at 17:19
  • The poLCA object does not store the serials; you just have to trust that the posteriors are in the same order as the input dataframe. That said, you can write the results out with `write.csv(cbind(lcadata[,"serial"],lca2$posterior),file="output.csv")`; remove the `[,"serial"]` if you also want the variables V1...V4 in the output. – scoa Aug 03 '15 at 18:53

1 Answer

Something as simple as

write.csv(cbind(lcadata$serial, lca2$posterior), 'new_data.csv', row.names=FALSE)

should work.
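If named columns are wanted in the CSV, a minimal sketch along the same lines follows. It assumes the posterior matrix has one row per input row (no cases dropped for missing values). To run without fitting a model, it uses a small stand-in matrix in place of `lca2$posterior`. The helper name `posterior_table`, the `class1`, `class2` column names, and the output file name are hypothetical.

```r
# Hypothetical helper: pair an ID vector with a posterior matrix, row by row.
posterior_table <- function(ids, posterior) {
  # Row counts must agree, otherwise IDs cannot be matched by position.
  stopifnot(length(ids) == nrow(posterior))
  out <- data.frame(serial = ids, posterior)
  # Label the probability columns class1, class2, ...
  names(out)[-1] <- paste0("class", seq_len(ncol(posterior)))
  out
}

# Stand-in for lca2$posterior: 3 respondents, 2 classes, each row sums to 1.
post_demo <- matrix(c(0.9, 0.1,
                      0.2, 0.8,
                      0.5, 0.5), ncol = 2, byrow = TRUE)
ids_demo <- c(101, 205, 317)

result <- posterior_table(ids_demo, post_demo)

# Write to a temporary location; replace with any path you like.
out_file <- file.path(tempdir(), "posteriors.csv")
write.csv(result, out_file, row.names = FALSE)
```

With the fitted model from the question, the call would be `posterior_table(lcadata$serial, lca2$posterior)`.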

scribbles
  • Thanks - that worked. I've also done a quick manual sense check on whether the posteriors match what they should be from the relevant variable values - it seems to match up. However, is there any better way of checking whether (as @scoa notes) the posteriors are in the same order as the input dataframe? – rdevries Aug 04 '15 at 08:45
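
One possible way to check the ordering, offered as a sketch rather than a definitive test, combines three checks. First, compare row counts: with the default `na.rm = TRUE`, poLCA drops incomplete cases before fitting, so rows with missing values would make the posterior matrix shorter than the input. Second, recompute the posteriors from the input rows in their original order with `poLCA.posterior()` and compare them with `lca2$posterior`; small differences are expected because the stored values come from the last EM iteration. Third, respondents with identical answers should have identical posteriors. The sketch assumes a model without covariates; the seed and the 1e-12 threshold are arbitrary choices.

```r
library(poLCA)

set.seed(1)  # arbitrary seed, only for reproducibility
lcadata <- data.frame(
  serial = sample(1:1000, 100, replace = FALSE),
  V1 = sample(1:2, 100, replace = TRUE),
  V2 = sample(1:2, 100, replace = TRUE),
  V3 = sample(1:2, 100, replace = TRUE),
  V4 = sample(1:2, 100, replace = TRUE)
)
f <- cbind(V1, V2, V3, V4) ~ 1
lca2 <- poLCA(f, lcadata, nclass = 2, maxiter = 5000, nrep = 10,
              verbose = FALSE)

# Check 1: no rows were dropped, so positions can line up at all.
same_rows <- nrow(lca2$posterior) == nrow(lcadata)

# Check 2: recompute posteriors from the input rows, in input order.
manifest <- as.matrix(lcadata[, c("V1", "V2", "V3", "V4")])
recomputed <- poLCA.posterior(lca2, y = manifest)
max_diff <- max(abs(recomputed - lca2$posterior))

# Check 3: identical response patterns should share identical posteriors.
pattern <- interaction(lcadata[, c("V1", "V2", "V3", "V4")], drop = TRUE)
spread <- tapply(lca2$posterior[, 1], pattern,
                 function(p) diff(range(p)))
pattern_consistent <- all(spread < 1e-12)

print(c(same_rows = same_rows, pattern_consistent = pattern_consistent))
print(max_diff)
```

If `same_rows` and `pattern_consistent` are both `TRUE` and `max_diff` is close to zero, the stored posteriors follow the row order of `lcadata`. Note that these checks cannot distinguish respondents who gave identical answers, but such respondents have the same posteriors anyway.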