Using the survey package, I am having issues creating an imputationList that svydesign will accept. Here is a reproducible example:

library(tibble)
library(survey)
library(mitools)


# Data set 1
# Note that I am excluding the "income" variable from the "df"s and creating  
# it separately so that it varies between the data sets. This simulates the 
# variation with multiple imputation. Since I am using the same seed
# (i.e., 123), all the other variables will be the same, the only one that 
# will vary will be "income."

set.seed(123)

df1 <- tibble(id      = seq(1, 100, by = 1),
              gender  = as.factor(rbinom(n = 100, size = 1, prob = 0.50)),
              working = as.factor(rbinom(n = 100, size = 1, prob = 0.40)),
              pweight = sample(50:500, 100,  replace   = TRUE))


# Data set 2

set.seed(123)

df2 <- tibble(id      = seq(1, 100, by = 1),
              gender  = as.factor(rbinom(n = 100, size = 1, prob = 0.50)),
              working = as.factor(rbinom(n = 100, size = 1, prob = 0.40)),
              pweight = sample(50:500, 100,  replace   = TRUE))


# Data set 3

set.seed(123)

df3 <- tibble(id      = seq(1, 100, by = 1),
              gender  = as.factor(rbinom(n = 100, size = 1, prob = 0.50)),
              working = as.factor(rbinom(n = 100, size = 1, prob = 0.40)),
              pweight = sample(50:500, 100,  replace   = TRUE))


# Create list of imputed data sets

impList <- imputationList(df1,
                          df2,
                          df3)


# Apply NHIS weights

weights <- svydesign(id     = ~id, 
                     weight = ~pweight, 
                     data   = impList)

I get the following error:

Error in eval(predvars, data, env) : 
  numeric 'envir' arg not of length one
scottsmith
  • Possible duplicate of https://stackoverflow.com/questions/9026383/r-numeric-envir-arg-not-of-length-one-in-predict – zx8754 Jan 28 '18 at 20:01
  • Error is coming from `svydesign`. We don't need to see how you got the data, try to create small [reproducible data](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) that would generate the same error, maybe `dput(head(impList))`. – zx8754 Jan 28 '18 at 20:03
  • Yes, the error is coming from `svydesign`, but I don't know why. I am following the example in `?imputationList` where `imputationList(datasets,...)`. Usually I do use small reproducible examples, but this is more complicated (e.g., imputed data, survey weights), and I thought it would be best to use real-world data as it is difficult to recreate the exact situation. – scottsmith Jan 28 '18 at 20:12
  • @zx8754 this isn't a duplicate.. the question is specific to `library(survey)` – Anthony Damico Jan 29 '18 at 11:01
  • @AnthonyDamico I said "possible" judging from the error message, so that OP can explore if that linked post is helpful. – zx8754 Jan 29 '18 at 11:03
  • Thanks @zx8754. I added a better reproducible example. – scottsmith Jan 30 '18 at 13:38

2 Answers

To get it to work, I needed to wrap the data frames in a single list() and pass the resulting imputationList directly to svydesign, as follows:

# imputationList() expects one list of data frames as its first argument
weights <- svydesign(id     = ~id,
                     weight = ~pweight,
                     data   = imputationList(list(df1,
                                                  df2,
                                                  df3)))
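
As a minimal sketch of the same fix end to end: `?imputationList` documents the first argument, `datasets`, as a list of data frames, so the three data sets have to be wrapped in `list()` rather than passed as separate arguments. The helper `make_df()` and the simulated `income` values below are hypothetical and only stand in for real imputed data; the argument names are spelled out in full (`ids`, `weights`).

```r
library(survey)
library(mitools)

# Hypothetical helper: identical covariates and weights in every data set,
# with only "income" varying, to mimic variation across imputations.
make_df <- function(income_seed) {
  set.seed(123)
  df <- data.frame(id      = 1:100,
                   gender  = factor(rbinom(n = 100, size = 1, prob = 0.50)),
                   pweight = sample(50:500, 100, replace = TRUE))
  set.seed(income_seed)
  df$income <- round(rnorm(100, mean = 50000, sd = 10000))
  df
}

# Wrap the data frames in ONE list before calling imputationList()
imp <- imputationList(list(make_df(1), make_df(2), make_df(3)))

# svydesign() builds one survey design per imputed data set
des <- svydesign(ids = ~id, weights = ~pweight, data = imp)

# Run the estimate on each design, then pool the results across imputations
est    <- with(des, svymean(~income))
pooled <- MIcombine(est)
summary(pooled)
```

The `with()` / `MIcombine()` pair follows the pattern shown in the survey package's documentation for multiply-imputed designs; other estimators such as `svyglm()` can be substituted for `svymean()` in the same way.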
scottsmith

The step-by-step instructions at http://asdfree.com/national-health-interview-survey-nhis.html walk through exactly how to create a multiply-imputed NHIS design, and the analysis examples below them include svyglm calls. Avoid using library(data.table) and library(dplyr) with library(survey).

Anthony Damico
  • Thanks @AnthonyDamico. Why avoid using `library(data.table)` and `library(dplyr)` with `library(survey)`? They are my go-to data wrangling libraries. – scottsmith Jan 29 '18 at 14:40
  • neither work with survey design objects. `library(srvyr)` allows some `library(dplyr)` commands – Anthony Damico Jan 29 '18 at 15:34
  • Thanks @AnthonyDamico. I have similar question you might be interested in here https://stackoverflow.com/questions/48506315/marginal-effects-with-survey-weights-and-multiple-imputations – scottsmith Jan 30 '18 at 13:32