I am using the following code to generate data, and I am estimating regression models across a list of variables (covar1 and covar2). I have also created confidence intervals for the coefficients and merged them together.

I have been examining all sorts of examples here and on other sites, but I can't seem to accomplish what I want. I want to stack the results for each covar into a single data frame, labeling each cluster of results by the covar it is attributable to (i.e., "covar1" and "covar2"). Here is the code for generating data and results using lapply:

##creating a fake dataset (N=1000, 500 at treated, 500 at control group)
#outcome variable
outcome <- c(rnorm(500, mean = 50, sd = 10),  rnorm(500, mean = 70, sd = 10))

#running variable
running.var <- seq(0, 1, by = .0001)
running.var <- sample(running.var, size = 1000, replace = TRUE)

##Put negative values for the running variable in the control group
running.var[1:500] <- -running.var[1:500]

#treatment indicator (just a binary variable indicating treated and control groups)
treat.ind <- c(rep(0,500), rep(1,500))

#create covariates
set.seed(123)
covar1 <- c(rnorm(500, mean = 50, sd = 10), rnorm(500, mean = 50, sd = 20))
covar2 <- c(rnorm(500, mean = 10, sd = 20), rnorm(500, mean = 10, sd = 30))
data <- data.frame(cbind(outcome, running.var, treat.ind, covar1, covar2))
data$treat.ind <- as.factor(data$treat.ind)

#Bundle the covariates names together
covars <- c("covar1", "covar2")

#loop over them using a convenient feature of the "as.formula" function
models <- lapply(covars, function(x){
  regres <- lm(as.formula(paste(x," ~ running.var + treat.ind",sep = "")), data = d)
  ci <-confint(regres, level=0.95)
  regres_ci <- cbind(summary(regres)$coefficient, ci)
})
names(models) <- covars
print(models)

Any nudge in the right direction, or link to a post I just haven't come across, is greatly appreciated.

  • What is `d` in the code? – m0nhawk Nov 21 '18 at 19:32
  • In the `lm()` call within the `lapply()`, is `d` meant to be `data`? Also, it would help if you could outline the expected output (dimensions and colnames of the expected dataframe) – 12b345b6b78 Nov 21 '18 at 19:33
  • good points above, I'm guessing something like `models %>% purrr::map_df(broom::tidy, .id = "covar_id")` will get close to what you want – Nate Nov 21 '18 at 19:35

1 Answer

You can use `do.call`, where the second argument is a list (like here):

do.call(rbind, models)
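
To see what that call does, here is a small sketch with two made-up matrices standing in for the per-covariate coefficient tables (the names `m1`, `m2` and `toy`, and the values in them, are hypothetical). `do.call` passes the list elements to `rbind` as its arguments, so the result stacks the matrices row-wise. The list names are not carried into the stacked rows, which is why a label column is useful.

```r
# Two small matrices standing in for the coefficient tables (made-up values)
m1 <- matrix(1:4, nrow = 2, dimnames = list(c("(Intercept)", "x"), c("est", "se")))
m2 <- matrix(5:8, nrow = 2, dimnames = list(c("(Intercept)", "x"), c("est", "se")))
toy <- list(covar1 = m1, covar2 = m2)

# do.call(rbind, toy) calls rbind with m1 and m2 as its arguments
stacked_toy <- do.call(rbind, toy)

dim(stacked_toy)       # number of rows is the sum of the rows of m1 and m2
rownames(stacked_toy)  # row names come from the matrices, not from names(toy)
```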

I made a (possible) improvement to your `lapply` function. This way you can save the covariate name and the parameter names in a data.frame together with the estimates:

models <- lapply(covars, function(x){
  regres <- lm(as.formula(paste(x," ~ running.var + treat.ind",sep = "")), data = data)
  ci <-confint(regres, level=0.95)
  regres_ci <- data.frame(covar=x,param=rownames(summary(regres)$coefficient),
                          summary(regres)$coefficient, ci)
})

do.call(rbind,models)
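
As a self-contained sketch of the same idea, the version below runs on a smaller simulated data set (the object names `dat`, `est` and `stacked`, and the sample size, are my own assumptions, not from the question). It also passes `row.names = NULL` and `check.names = FALSE` to `data.frame`, which is optional: it keeps column names such as "Std. Error" and "2.5 %" unmodified and avoids duplicated row names when stacking.

```r
set.seed(123)

# Small simulated data set (assumed values, only for illustration)
n <- 200
dat <- data.frame(
  running.var = runif(n, -1, 1),
  covar1 = rnorm(n, mean = 50, sd = 10),
  covar2 = rnorm(n, mean = 10, sd = 20)
)
dat$treat.ind <- factor(as.integer(dat$running.var > 0))

covars <- c("covar1", "covar2")

models <- lapply(covars, function(x) {
  regres <- lm(as.formula(paste(x, "~ running.var + treat.ind")), data = dat)
  est <- coef(summary(regres))
  # One row per coefficient, labelled with the covariate used as the response
  data.frame(covar = x, param = rownames(est), est,
             confint(regres, level = 0.95),
             row.names = NULL, check.names = FALSE)
})

# Stack the per-covariate data frames into one
stacked <- do.call(rbind, models)
str(stacked)
```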
P. Paccioretti