
I have a chunk of code that I want to turn into a function that takes the data frame, the name of the ResponseId variable, and a range of variables. It chokes on the ResponseId variable name and on the range of variables. I assume that is because I am passing them into the function incorrectly.

Working chunk:


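```r
library(mice)

set.seed(1024)

# Keep the ID column plus the range of measures to impute
premice_1 <- subset(emu_raw, select=c(ResponseId, PMS_1:PMS_45))

# Dry run (maxit=0) to get the default predictor matrix and methods
premice_2 <- mice(premice_1, maxit=0, printFlag=FALSE)
pred <- premice_2$pred
pred[,c("ResponseId")] <- 0   # do not use ResponseId as a predictor
meth <- premice_2$meth
meth[c("ResponseId")] <- ""   # do not impute ResponseId

postmice_a <- mice(premice_1, m=1, maxit=10, printFlag=TRUE, pred=pred, 
                   meth=meth, seed=4201)
postmice_b <- mice::complete(postmice_a, "long", include=FALSE)
postmice_b$.imp <- NULL
postmice_b$.id <- NULL

# Merge the imputed data back into emu_raw
emu_raw <- merge(emu_raw, postmice_b)
```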

Failing function. When I call it as `missing_values(emu_raw, ResponseId, PMS_1:PMS_45)`, R reports that `PMS_1` doesn't exist, so the problem appears to be in how the series of columns is passed:

```r
missing_values <- function(df, ResponseId, measure_range) {

  set.seed(1024)

  # This is the line that fails: PMS_1 is not found
  premice_1 <- subset(df, select=c(ResponseId, measure_range))
  premice_2 <- mice(premice_1, maxit=0, printFlag=FALSE)
  pred <- premice_2$pred
  pred[,c("ResponseId")] <- 0
  meth <- premice_2$meth
  meth[c("ResponseId")] <- ""

  postmice_a <- mice(premice_1, m=1, maxit=10, printFlag=TRUE, pred=pred, 
                     meth=meth, seed=4201)
  postmice_b <- mice::complete(postmice_a, "long", include=FALSE)
  postmice_b$.imp <- NULL
  postmice_b$.id <- NULL

  return(postmice_b)
}
```

I've also tried selecting the columns with brackets instead of the subset() function, but the same error occurs.

```r
missing_values <- function(df, ResponseId, measure_range) {

  set.seed(1024)

  #premice_1 <- subset(df, select=c(ResponseId, measure_range))
  premice_1 <- df[,c(ResponseId, measure_range)]
  premice_2 <- mice(premice_1, maxit=0, printFlag=FALSE)
  pred <- premice_2$pred
  pred[,c(ResponseId)] <- 0
  meth <- premice_2$meth
  meth[c(ResponseId)] <- ""

  postmice_a <- mice(premice_1, m=1, maxit=10, printFlag=TRUE, pred=pred, 
                     meth=meth, seed=4201)
  postmice_b <- mice::complete(postmice_a, "long", include=FALSE)
  postmice_b$.imp <- NULL
  postmice_b$.id <- NULL

  return(postmice_b)
}
```

`missing_values(emu_raw, "ResponseId", PMS_1:PMS_45)`

```
Error in `[.tbl_df`(df, , c(ResponseId, measure_range)) :
object 'PMS_1' not found

3. `[.tbl_df`(df, , c(ResponseId, measure_range))
2. df[, c(ResponseId, measure_range)]
1. missing_values(emu_raw, "ResponseId", PMS_1:PMS_45)
```
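
The traceback shows the error arising where `measure_range` is first used. A minimal sketch of the likely cause, using a hypothetical toy data frame `toy` and a hypothetical helper `pick_cols` (neither is from the post): R evaluates function arguments lazily and in the caller's environment, so `PMS_1:PMS_2` is looked up as ordinary variables rather than as columns of `df`.

```r
# Toy data standing in for emu_raw (assumption: only the column names matter)
toy <- data.frame(ResponseId = c("a", "b"),
                  PMS_1 = c(1, NA),
                  PMS_2 = c(3, 4))

# Hypothetical helper mirroring the bracket version of missing_values()
pick_cols <- function(df, id_col, measure_range) {
  # measure_range is forced here; the expression PMS_1:PMS_2 is evaluated
  # in the caller's environment, where no variable named PMS_1 exists
  df[, c(id_col, measure_range)]
}

# Capture the error message instead of stopping the script
msg <- tryCatch(
  pick_cols(toy, "ResponseId", PMS_1:PMS_2),
  error = function(e) conditionMessage(e)
)
print(msg)
```

The same lookup works at top level with `subset()` only because `subset()` captures its `select` argument unevaluated and resolves it against the column names.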
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. So you want to pass in `ResponseId` as a character value but `measure_range` as an expression? – MrFlick Mar 02 '23 at 18:24
  • Related: https://stackoverflow.com/questions/11880906/pass-subset-argument-through-a-function-to-subset – MrFlick Mar 02 '23 at 18:26
  • Yes, I do want to pass measure_range as an expression. It is PMS_1 through PMS_45; there are 45 columns of values in the data frame. – JRDubbleu Mar 03 '23 at 01:50
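
One way to pass `ResponseId` as a string and the measure range as an unquoted expression is to capture the expression with `substitute()` and evaluate it against the column positions, which is roughly what `subset()` does internally. The sketch below is only an illustration under stated assumptions: `select_range` and the toy data frame are hypothetical names, and the `mice` steps are left out so that it runs in base R.

```r
# Hypothetical helper (not from the original post): select an ID column,
# given as a string, plus a range of columns given as an unquoted
# expression such as PMS_1:PMS_45.
select_range <- function(df, id_col, measure_range) {
  # Capture the expression itself instead of evaluating it
  range_expr <- substitute(measure_range)

  # Map each column name to its position, so PMS_1:PMS_3 becomes 2:4
  positions <- as.list(seq_along(df))
  names(positions) <- names(df)

  # Evaluate the captured expression with column names as variables
  cols <- eval(range_expr, positions, parent.frame())

  df[, c(id_col, names(df)[cols]), drop = FALSE]
}

# Toy data standing in for emu_raw (assumption)
toy <- data.frame(ResponseId = c("a", "b", "c"),
                  PMS_1 = c(1, NA, 3),
                  PMS_2 = c(4, 5, NA),
                  PMS_3 = 7:9,
                  other = 1:3)

premice_1 <- select_range(toy, "ResponseId", PMS_1:PMS_3)
print(names(premice_1))
```

To use this in `missing_values`, the `substitute()` and `eval()` lines would need to sit directly inside that function, because `substitute()` only sees the expression supplied to the function it is called in. With dplyr loaded, `dplyr::select(df, {{ measure_range }})` should achieve the same through tidy evaluation.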
