
Is there a simple way to extract residuals from a list of lm objects generated by lapply()?

I am attempting to run separate lm models to correct for covariates across a list of variables in a dataset split by sex and event. I've been able to use lapply() to create a list of lm objects for each variable and data subset, and I would now like to extract the residuals from these models and join them with the original data frame. I have found a long-form way to do this, but I'm sure there's a simpler solution.

UPDATE/EDIT: The following produces a simplified data set with the same structure, using three variables and one covariate:

set.seed(42)
n <- 6
dat <- data.frame(id=rep(c(1:n),3), 
                  sex=rep(rep(c("F","M"),n/2),3),
                  event=c(rep(1,n),rep(2,n),rep(3,n)),
                  a=rnorm(n*3),
                  b=rnorm(n*3),
                  e=rnorm(n*3),
                  y=rnorm(n*3))
dat$sex.event<-paste0(dat$sex,dat$event)

myvars<-as.list(c("a","b","e"))

by.sex.event<-split(dat,dat$sex.event)

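## for one variable name x, fit lm(x ~ y) within each sex.event subset
my_reg<-function(x){lapply(by.sex.event,function(dd)lm(get(x)~y,
                                                       na.action=na.exclude,data=dd))}

## models[[i]] holds the per-subset fits for myvars[[i]]
models<-lapply(myvars,my_reg)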

The long-form solution I have found is to run the following line separately for each variable (a is models[[1]], b is models[[2]], e is models[[3]]) and each data subset ($F1, $F2, $F3, $M1, $M2, $M3):

## rejoin residuals with data by sex.event
by.sex.event$F1$a_resid<-residuals(models[[1]]$F1)

I then transform back to the original data structure, with the residuals for each variable added, using:

## return data to original (unsplit) form
dat<-do.call("rbind",by.sex.event)

I hope this clarifies the structural issues I'm encountering!
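
For reference, the repetition in the long-form approach could probably be avoided by looping over the variables inside each subset, rather than looping over the subsets inside each variable. The following is only a minimal sketch under that assumption, not a tested solution for the full 13-variable data: `add_resids` and `dat_resid` are hypothetical names introduced here, and the formula is built with `reformulate()` instead of `get(x)`, which should yield the same fits.

```r
# Rebuild the example data from the question
set.seed(42)
n <- 6
dat <- data.frame(id = rep(c(1:n), 3),
                  sex = rep(rep(c("F", "M"), n / 2), 3),
                  event = c(rep(1, n), rep(2, n), rep(3, n)),
                  a = rnorm(n * 3),
                  b = rnorm(n * 3),
                  e = rnorm(n * 3),
                  y = rnorm(n * 3))
dat$sex.event <- paste0(dat$sex, dat$event)

myvars <- c("a", "b", "e")
by.sex.event <- split(dat, dat$sex.event)

# Hypothetical helper: within a single subset dd, fit v ~ y for every
# variable name in vars and store the residuals in a "<v>_resid" column.
add_resids <- function(dd, vars) {
  for (v in vars) {
    fit <- lm(reformulate("y", response = v), data = dd,
              na.action = na.exclude)
    # with na.exclude, residuals() is padded with NA to nrow(dd)
    dd[[paste0(v, "_resid")]] <- residuals(fit)
  }
  dd
}

# Apply the helper to every sex.event subset
by.sex.event <- lapply(by.sex.event, add_resids, vars = myvars)

# Reassemble in the original row order (unlike do.call("rbind", ...),
# which stacks the subsets group by group)
dat_resid <- unsplit(by.sex.event, dat$sex.event)
```

Because `na.action = na.exclude` pads the residuals to the length of each subset, the new columns should line up with the rows even when some values are missing.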

mwalesp
  • `lapply(models, "[[", "residuals")` will give you a list of the `residuals` component of each model. – Gregor Thomas Apr 04 '23 at 18:00
  • This was helpful but didn't quite achieve my aims, given that the 'models' list has an additional layer ([[1]]:[[13]] each with F1,F2,F3,M1,M2,M3 for the sex.event split data). For instance `lapply(models[[1]], "[[", "residuals")` did give me a list of residuals for the first variable, with the list split by sex.event. `by.sex.event$F1$PDS_p_resid2<-residuals(models[[1]]$F1)` is the most accurate way I've found to rejoin this with the original data frame, but again has to be run separately for each variable (1:13) and data subset (F1,F2,F3,M1,M2,M3). – mwalesp Apr 05 '23 at 19:16
  • Could you make a small reproducible example? It's hard to help when we can't run any of your code and therefore can't test any solutions. Maybe you could replicate a smaller version of your structure using built-in or simulated data so that we could run it. [See this FAQ](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for more advice on creating nice reproducible examples in R. – Gregor Thomas Apr 05 '23 at 19:18
  • Thank you for your suggestion! I've updated with a small reproducible example of the data structure. – mwalesp Apr 05 '23 at 20:17
