
I want to "stratify-then-impute" using the packages available in R.

That is, I am hoping to: 1) stratify my dataset by a binary variable called "arm", which has no missing data; 2) run an imputation model on each of the two subsets; 3) combine the two imputed datasets; 4) run a pooled analysis.

My dataset looks like:

dataSim <- structure(list(pid = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 
            13, 14, 15, 16, 17, 18, 19, 20), arm = c(0, 0, 0, 0, 0, 0, 0, 
            0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), X1 = c(0.1, NA, 0.51, 
            0.56, -0.82, NA, NA, NA, -0.32, 0.4, 0.58, NA, 0.22, -0.23, 1.49, 
            -1.88, -1.77, -0.94, NA, -1.34), X2 = c(NA, -0.13, NA, 1.2, NA, 
            NA, NA, 0.02, -0.04, NA, NA, 0.25, -0.81, -1.67, 1.01, 1.69, 
            -0.06, 0.07, NA, -0.11)), .Names = c("pid", "arm", "X1", "X2"
             ), row.names = c(NA, 20L), class = "data.frame")  

To impute the data, I'm currently using the mi() function as follows:

library(mi)

data.1 <-  dataSim[dataSim[,"arm"]==1,]
data.0 <- dataSim[dataSim[,"arm"]==0,]

data.miss.1 <- missing_data.frame(data.1)
data.miss.0 <- missing_data.frame(data.0)

# Impute each arm separately, using the missing_data.frame objects created above
imputations.1 <- mi(data.miss.1, n.iter=5, n.chains=5, max.minutes=20, parallel=FALSE)
imputations.0 <- mi(data.miss.0, n.iter=5, n.chains=5, max.minutes=20, parallel=FALSE)

complete(imputations.1)   # viewing the imputed datasets
complete(imputations.0)

Then I don't know how to combine the 2 imputations in order to do a pooled analysis. I have unsuccessfully tried:

imputations <-  rbind(imputations.0, imputations.1)  # This doesn't work

# analysis.X1 <- pool(X1 ~ arm, data = imputations ) # This is what I want to run

I assume this method is a simplified version of including an interaction term in the imputation model, but I don't know how to do that either.
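
For reference, the combining and pooling steps can be sketched in base R without relying on pool(). This is only a sketch under assumptions: the helper names (`fake_complete`, `stack_completed`, `pool_rubin`) and the `toy` data are hypothetical, and `fake_complete` is a crude stand-in that fills each NA with a random observed value from the same arm. It is not an imputation model. With mi, the two lists would presumably come from `complete(imputations.0, m = 5)` and `complete(imputations.1, m = 5)` instead, assuming those calls return lists of data frames with matching columns. The pooling uses Rubin's rules for the point estimate and standard error only and leaves out the degrees-of-freedom adjustment.

```r
# Toy data with the same layout as the question (values are illustrative).
toy <- data.frame(
  arm = rep(c(0, 1), each = 6),
  X1  = c(0.1, NA, 0.51, 0.56, -0.82, NA, 0.58, NA, 0.22, -0.23, 1.49, -1.88),
  X2  = c(NA, -0.13, NA, 1.2, 0.02, -0.04, NA, 0.25, -0.81, -1.67, 1.01, 1.69)
)

# Hypothetical stand-in for real imputation output: returns m completed copies
# of d, filling each NA with a randomly drawn observed value of that variable.
fake_complete <- function(d, m) {
  lapply(seq_len(m), function(k) {
    for (v in c("X1", "X2")) {
      miss <- is.na(d[[v]])
      obs <- d[[v]][!miss]
      d[[v]][miss] <- obs[sample.int(length(obs), sum(miss), replace = TRUE)]
    }
    d
  })
}

# Stack the k-th completed data set of one arm on the k-th of the other.
stack_completed <- function(list0, list1) {
  stopifnot(length(list0) == length(list1))
  Map(rbind, list0, list1)
}

# Rubin's rules for one coefficient: q = estimates, u = squared standard errors.
pool_rubin <- function(q, u) {
  m <- length(q)
  u_bar <- mean(u)                  # within-imputation variance
  b <- var(q)                       # between-imputation variance
  t_var <- u_bar + (1 + 1 / m) * b  # total variance
  c(estimate = mean(q), se = sqrt(t_var))
}

set.seed(1)
m <- 5
completed0 <- fake_complete(toy[toy$arm == 0, ], m)  # arm 0, imputed on its own
completed1 <- fake_complete(toy[toy$arm == 1, ], m)  # arm 1, imputed on its own
stacked <- stack_completed(completed0, completed1)

# Fit the analysis model on each stacked data set, then pool the arm effect.
fits <- lapply(stacked, function(d) lm(X1 ~ arm, data = d))
q <- sapply(fits, function(f) coef(f)[["arm"]])
u <- sapply(fits, function(f) vcov(f)["arm", "arm"])
pooled <- pool_rubin(q, u)
print(pooled)
```

Pairing the k-th completed data set from one arm with the k-th from the other is arbitrary, since the two sets of imputations are independent. Any one-to-one pairing should be equally valid, as long as both arms use the same number of imputations.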

Thanks

VTate
  • What do you mean by "I don't know how to combine these results to do a pooled analysis"? Does this mean you don't know how to combine the results, or that you don't know whether you are using the proper statistical methodology after the combination? If the latter, you should refocus the question on this and ask it on [crossValidated] (http://stats.stackexchange.com/), which is the proper place for stats methodology questions. Such questions are off topic on SO. – lmo Aug 03 '16 at 12:22
  • I don't know how to combine them. E.g. for the mice() command there is the ibind() function, which allows you to combine imputations from the same dataset, but I have two different datasets. I'm happy with the methodology; I just don't know how to implement it. – VTate Aug 03 '16 at 12:31
  • please show your `library()` statements, and see [how to make a great R reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – C8H10N4O2 Aug 03 '16 at 15:03
  • I've now added library(mi) and tried to make my code clearer. Sorry for not doing this previously. – VTate Aug 04 '16 at 09:25
  • I suggest you email the package authors. The information is not in their vignette. The `mi::complete` command does not suffice, and `str(imputations.0@data.....)` is not easily interpreted. – alexwhitworth Aug 06 '16 at 19:39

0 Answers