Using nest and tidy to create formatted html regression output in RMarkdown

Question

I have used tidyr::nest to run a series of logistic regression models with different dependent variables. I want to output the results as a single html table in RMarkdown with each model as a column and the rows as exponentiated coefficients with 99% CIs. I cannot figure out the workflow between nest, tidy, and a table-rendering package like stargazer to get this to work. If I unnest my tidy output and pass it to stargazer, or if I just try to pass the nested output (the variable in the nested data frame called "model" below) to stargazer directly, I get no output. I would prefer to use the tidy output anyway because of the exponentiated coefficients and 99% CIs. I essentially need this vignette to go one step farther and explain how to use the output of nest and tidy to create formatted regression tables. I also looked at this SO post, but I find it hard to believe that there isn't an easier way to do this that I am just missing.

Sample data, along with my general approach to running the models:

id <- 1:2000
gender <- sample(0:1, 2000, replace = T)
age <- sample(17:64, 2000, replace = T)
race <- sample(0:1, 2000, replace = T)
cond_a <- sample(0:1, 2000, replace = T)
cond_b <- sample(0:1, 2000, replace = T)
cond_c <- sample(0:1, 2000, replace = T)
cond_d <- sample(0:1, 2000, replace = T)

df <- data.frame(id, gender, age, race, cond_a, cond_b, cond_c, cond_d)

df %>% gather(c(cond_a, cond_b, cond_c, cond_d), key = "condition", value = "case") %>% 
  group_by(condition) %>% nest() %>% 
  mutate(model = map(data, ~glm(case ~ gender + age + race, family = "binomial", data = .)),
         tidy = map(model, tidy, exponentiate = T, conf.int = T, conf.level = 0.99))

i don't get your question.. you can just unnest the tidy part and you get the table of regression coefficients? — StupidWolf, Feb 12 '21 at 18:49
Sorry, I am struggling to figure out how to explain my problem. I edited the post to try to be clearer about where things are breaking down for me. — Kellan Baker, Feb 12 '21 at 19:02

StupidWolf · Accepted Answer · 2021-02-13T10:18:19.007

Hope I get it correct, essentially you pass the lm objects and the confidence interval separately into stargazer, using this answer from your linked question.

You can read the stargazer help page for how to input for example custom ci would require:

ci.custom: a list of two-column numeric matrices that will replace the default confidence intervals for each model. The first and second columns represent the lower and the upper bounds, respectively. Matched by element names.

So in your case, it's a little bit more work, we store the results first.

result = df %>% gather(c(cond_a, cond_b, cond_c, cond_d), key = "condition", value = "case") %>% 
  group_by(condition) %>% nest() %>% 
  mutate(model = map(data, ~glm(case ~ gender + age + race, family = "binomial", data = .)),
         tidy = map(model, tidy, exponentiate = T, conf.int = T, conf.level = 0.99))

tidy_model = result$tidy
fit = result$model

Then pull out CI and coefficients:

CI = lapply(tidy_model,function(i)as.matrix(i[,c("conf.low","conf.high")]))
Coef = lapply(tidy_model,"[[","estimate")

Then apply stargazer:

stargazer(fit, type = "text", 
  coef = Coef,
  ci.custom = CI)


=============================================================================
                                      Dependent variable:                    
                  -----------------------------------------------------------
                                             case                            
                       (1)            (2)            (3)            (4)      
-----------------------------------------------------------------------------
gender               0.996***       1.182***       1.196***       0.921***   
                  (0.790, 1.254) (0.938, 1.489) (0.950, 1.508) (0.731, 1.161)
                                                                             
age                  1.004***       1.001***       0.999***       1.005***   
                  (0.995, 1.012) (0.993, 1.009) (0.990, 1.007) (0.996, 1.013)
                                                                             
race                 0.911***       0.895***       0.944***       1.213***   
                  (0.724, 1.148) (0.711, 1.128) (0.749, 1.189) (0.963, 1.529)
                                                                             
Constant             0.919***       0.997***       0.959***       0.761***   
                  (0.623, 1.356) (0.676, 1.472) (0.649, 1.415) (0.515, 1.123)
                                                                             
-----------------------------------------------------------------------------
Observations          2,000          2,000          2,000          2,000     
Log Likelihood      -1,385.107     -1,382.664     -1,383.411     -1,382.272  
Akaike Inf. Crit.   2,778.215      2,773.329      2,774.821      2,772.544   
=============================================================================
Note:                                             *p<0.1; **p<0.05; ***p<0.01

this is incredibly helpful, thank you. I guess I am surprised that there is not a tidyverse way of exporting the output from models stored in a nested data frame to regression tables. I am not wedded to stargazer, it just has worked a bit better for me than others I've tried, which have included gtsummary, kable, and jtools::export_summs. jtools::export_summs gives me the right coefficients, etc, but it can only work with model objects themselves, which don't seem to be preserved anywhere in the nested data frame. — Kellan Baker, Feb 13 '21 at 16:50
in your case, you need something more custom, and tidy works with long format data, so not the most straightforward i would guess. I don't work much with Rmarkdown or regression output, so not the best person to comment on how to use tidy with these — StupidWolf, Feb 13 '21 at 17:33

Using nest and tidy to create formatted html regression output in RMarkdown

1 Answers1