
I'm trying to use mice to impute missing data in a genotype matrix. I get the error dim(X) must have a positive length, similar to the question "dim(X) must have a positive length when using mice function". In addition, I get the following warning message: In nnet::multinom(formula(xy), data=xy[ry, , drop = FALSE],weights = w[ry], : group '2' is empty

Here's some context: the data are supplied as a data frame. All of the variables except the last one are categorical with 3 categories: 0, 1 and 2 (plus the missing value NA). 2 is the rarest value, and in some variables it does not appear at all. The last variable is categorical with 2 categories, 0 and 1, and has no missing data. After saving the data in the variable data0, I called mice like this: data1 <- complete(mice(data0,method="polyreg"),action=1)
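
To see which variables actually lack the value 2, a per-column tabulation may help. The following is a minimal base-R sketch on a small made-up data frame (`toy`, `V1` and `V2` are hypothetical names, not the real genotype data); it does not call mice at all.

```r
# Hypothetical toy data: V2 never takes the value 2
toy <- data.frame(
  V1 = factor(c(0, 1, 2, NA), levels = c(0, 1, 2)),
  V2 = factor(c(0, 1, 0, NA), levels = c(0, 1, 2))
)

# Count observations per declared level; NA values are excluded by default
level_counts <- lapply(toy, table)
print(level_counts)

# Flag columns in which at least one declared level has no observations
has_empty_level <- vapply(level_counts, function(tab) any(tab == 0), logical(1))
print(has_empty_level)

# droplevels() removes unused levels from every factor column
toy_dropped <- droplevels(toy)
print(lapply(toy_dropped, levels))
```

Columns flagged here are the ones where a declared level such as 2 is empty, which is the situation the warning message describes.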

I thought this error might be caused by my forcing 2 to be a valid level for variables in which 2 never appears. I tried to make a minimal reproducible example based on that:

library('mice')
A <- rbind(c(0,0,0),c(1,1,1),c(2,2,NA))
data0 <- data.frame(factor(A[,1],levels=c(0,1,2)))
data0 <- data.frame(data0,factor(A[,2],levels=c(0,1,2)))
data0 <- data.frame(data0,factor(A[,3],levels=c(0,1,2)))
colnames(data0) <- paste0('V',1:3)
data1 <- complete(mice(data0,method='polyreg'),action=1)

That gives me the error object .Random.seed not found
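
One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: in R, the `.Random.seed` object only exists in the global environment once the random number generator has been seeded or used in the session. The base-R sketch below demonstrates that behaviour; whether it explains the mice error here is not established by anything in this post.

```r
# .Random.seed lives in the global environment and is created lazily.
# Remove it (if present) to mimic a fresh session.
if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
  rm(list = ".Random.seed", envir = globalenv())
}
seed_before <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)

# Seeding (or drawing any random number) creates the object
set.seed(123)
seed_after <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)

print(c(before = seed_before, after = seed_after))
```

If the object turns out to be missing in the session, calling `set.seed()` before `mice()` would be one thing to try.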

I guess I should ask first: what's wrong with this simple example? And then, if possible, how can I fix the real case?

Thanks

J.D.
  • Hi J.D., it will be easier to help if you provide at least a sample of `data0` with `dput(data0)`. – Ian Campbell Mar 27 '20 at 02:27
  • Categorical by R definition is "factor". So if the classes of your variables are not factor, then R (and `mice`) will assume they are not categorical. – Edward Mar 27 '20 at 03:11
  • @IanCampbell @Edward `dput(data0)` outputs structure(list(V1=c [tens of thousands of lines, most of which say 0L, 0L, 0L, 0L, … with an occasional 1L]. I'm not sure what a "factor" is. How do I make the classes of my variables "factor"? On a different note, is there a way to reduce the display limit? – J.D. Mar 27 '20 at 04:05
  • @J.D. In most cases there's no need to share an entire data set. Take a minute and learn how to make a minimal, self-contained, reproducible example, read: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610 – jay.sf Mar 27 '20 at 13:37
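
On the two points raised in the comments (making variables factors, and sharing a manageable sample), here is a minimal base-R sketch. The data frame `geno` and its column names are made up for illustration; the real `data0` would take its place.

```r
# Hypothetical integer-coded genotype data (stand-in for the real data0)
geno <- data.frame(
  V1 = c(0L, 1L, NA, 0L, 2L, 0L),
  V2 = c(0L, 0L, 1L, NA, 0L, 0L),
  outcome = c(0L, 1L, 0L, 0L, 1L, 1L)
)
print(sapply(geno, class))   # all "integer", so not treated as categorical

# Convert every column to a factor; genotype columns get levels 0, 1, 2
geno_cols <- setdiff(names(geno), "outcome")
geno[geno_cols] <- lapply(geno[geno_cols], factor, levels = c(0, 1, 2))
geno$outcome <- factor(geno$outcome, levels = c(0, 1))
print(sapply(geno, class))   # now all "factor"

# Share only a small slice instead of the full dput() output
dput(head(geno, 3))
```

Declaring `levels = c(0, 1, 2)` keeps 2 as a level even in columns where it never occurs, which is the situation described in the question, so this shows the conversion only and is not a claim that it resolves the error.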

0 Answers