calculate lm for several combinations of grouping variables

Question

I have a data frame with a dependent variable "my_depend", a predictor "my_predict", a time variable "time" with two levels ("20221" and "20222"), and several moderators, "mod1" to "mod5". Each moderator has a different number of levels. For each combination of the levels of time and mod1 to mod5, I want to calculate and save the slope of the linear model predicting "my_depend" from "my_predict". Any ideas how to approach this?

Welcome to SO! Please provide a [minimal reproducible example of your code and data along with relevant errors](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Edit your question to include a minimal reproducible example so that readers can run your code to answer your question. - Maël, Aug 03 '23 at 09:24

jpsmith · Answer 1 · 2023-08-03T11:36:05.703

You can create a vector of all the right hand side (rhs) combinations, then use lapply to run all the models.

Using the built-in mtcars dataset as df, with the columns renamed to match your question:

df <- mtcars[,1:8]
names(df) <- c("my_depend", "my_predict", "time", paste0("mod", 1:5))

Create a vector of combinations to put on the right-hand side of the glm formula, then use it in lapply:

vars <- names(df[-1])
# [1] "my_predict" "time"       "mod1"       "mod2"       "mod3"       "mod4"       "mod5"

#### all combinations of `vars`

rhs_prep <- unlist(lapply(seq(vars), combn, x = vars, paste, collapse = ' + '))
#[1] "my_predict"                                           "time"                                                 "mod1"                                                
#[4] "mod2"                                                 "mod3"                                                 "mod4"                                                
#[7] "mod5"                                                 "my_predict + time"                                    "my_predict + mod1"                                   
#[10] "my_predict + mod2"                                    "my_predict + mod3"                                    "my_predict + mod4"   
#...
#[124] "my_predict + time + mod2 + mod3 + mod4 + mod5"        "my_predict + mod1 + mod2 + mod3 + mod4 + mod5"        "time + mod1 + mod2 + mod3 + mod4 + mod5"             
#[127] "my_predict + time + mod1 + mod2 + mod3 + mod4 + mod5"

#### restrict to only those with `my_predict` and `time`

rhs_final <- rhs_prep[grep("my_predict \\+ time", rhs_prep)]
#[1] "my_predict + time"                                    "my_predict + time + mod1"                             "my_predict + time + mod2"                            
#[4] "my_predict + time + mod3"                             "my_predict + time + mod4"                             "my_predict + time + mod5"                            
#....
#[31] "my_predict + time + mod2 + mod3 + mod4 + mod5"        "my_predict + time + mod1 + mod2 + mod3 + mod4 + mod5"

final_models <- lapply(rhs_final, function(x) 
  glm(as.formula(paste0("my_depend ~ ", x)), data = df))
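Since the question asks about lm while the code above calls glm: with its default gaussian family (identity link), glm fits the same least-squares model, so the coefficients should agree. A quick check on one formula, as a minimal sketch (the names f, fit_glm and fit_lm are illustrative only and not part of the code above):

```r
# Rebuild the example data so this snippet runs on its own
df <- mtcars[, 1:8]
names(df) <- c("my_depend", "my_predict", "time", paste0("mod", 1:5))

# One of the right-hand sides, written out as a formula (illustrative choice)
f <- my_depend ~ my_predict + time + mod1

fit_glm <- glm(f, data = df)  # default family is gaussian with identity link
fit_lm  <- lm(f, data = df)

# Compare the estimated coefficients of the two fits
all.equal(coef(fit_glm), coef(fit_lm))
```

If lm objects are preferred, swapping glm for lm in the lapply call should work the same way.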

The object final_models contains all the information for the various model combinations as a list.
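To save only the slope of my_predict from each model, as the question asks, one option is to pull that coefficient out with coef(). This is a minimal sketch that rebuilds the objects from above so it runs on its own; slopes and slope_table are names introduced here for illustration:

```r
# Rebuild the objects from the answer so this snippet is self-contained
df <- mtcars[, 1:8]
names(df) <- c("my_depend", "my_predict", "time", paste0("mod", 1:5))
vars <- names(df)[-1]

# All combinations of the predictors, joined with " + "
rhs_prep <- unlist(lapply(seq_along(vars), function(m)
  combn(vars, m, paste, collapse = " + ")))

# Keep only the combinations containing both my_predict and time
rhs_final <- rhs_prep[grep("my_predict \\+ time", rhs_prep)]

final_models <- lapply(rhs_final, function(x)
  glm(as.formula(paste0("my_depend ~ ", x)), data = df))

# Extract the my_predict coefficient (the slope) from every model
slopes <- vapply(final_models,
                 function(m) unname(coef(m)["my_predict"]),
                 numeric(1))

# One row per model: the right-hand side used and the resulting slope
slope_table <- data.frame(rhs = rhs_final, slope = slopes)
head(slope_table)
```

slope_table can then be written out with write.csv() or joined to other results.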
