How to calculate linear models individually in melted Dataframe

Question

I have a data frame similar to the one called "test" below. I would like to fit a linear model of value against Time and Group for each Site: one model for A, one for B, and one for C.
For example, Site A is present in 2 Groups, G1 and G2, and was measured at 5 time points in each. So within each Group there are 5 values that should be modelled as dependent on Time (value ~ Time), and because the measurements were made under 2 conditions (Group), this should be included as well: (value ~ Time*Group).

How can I achieve this most efficiently, and then extract the information from each summary and store it in a vector or list?

Thank you for your time, I really appreciate it.

test <- data.frame(Site= rep(c( rep("A", 5),
                                rep("B", 5),
                                rep("C", 5)),2),
                   
                    value= c(rnorm(n = 15, mean = 1), rnorm(n = 15, mean = 1)),
                    Time= rep(rep((1:5), 3), 2),
                    Group= c(rep("G1", 15), rep("G2", 15))
                    )

# Loop ?
linReg <- lm(value ~ Time * Group, data= test)

this has **got** to be a duplicate ... `lme4::lmList(value~Time*Group|Site, data=test)` — Ben Bolker, Oct 01 '20 at 21:23
Thank you for linking me the original solution I might have missed it — Lukas, Oct 01 '20 at 21:27

andrew_reece · Accepted Answer · 2020-10-01T21:17:51.917

Use group_split() to split the data by Site, then map() to fit lm() on each piece:

library(tidyverse)

test %>%
  group_split(Site) %>%
  map(~lm(value ~ Time * Group, data = .))

Output:

[[1]]

Call:
lm(formula = value ~ Time * Group, data = .)

Coefficients:
 (Intercept)          Time       GroupG2  Time:GroupG2  
     -0.6393        0.5201        3.6533       -1.2188  


[[2]]

Call:
lm(formula = value ~ Time * Group, data = .)

Coefficients:
 (Intercept)          Time       GroupG2  Time:GroupG2  
    -0.38982       0.24745       0.58777      -0.08554  


[[3]]

Call:
lm(formula = value ~ Time * Group, data = .)

Coefficients:
 (Intercept)          Time       GroupG2  Time:GroupG2  
     0.17921       0.02528       2.13208      -0.34299

Add %>% summary(), or whatever other post-fitting processing you want, inside the call to map():

test %>% group_split(Site) %>%
  map(~lm(value ~ Time * Group, data = .) %>% summary())
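To collect the fitted coefficients into a single table, as the question asks, something along these lines may work. This is only a sketch: it assumes the broom package is installed (it is not used in the answer above), it swaps group_split() for split() so that the list keeps the Site names, and `coef_table` is a hypothetical name. The seed is added purely for reproducibility.

```r
library(dplyr)
library(purrr)
library(broom)  # assumption: broom is installed; not part of the original answer

set.seed(1)  # assumption: added only so the sketch is reproducible
test <- data.frame(
  Site  = rep(rep(c("A", "B", "C"), each = 5), 2),
  value = rnorm(n = 30, mean = 1),
  Time  = rep(1:5, 6),
  Group = rep(c("G1", "G2"), each = 15)
)

coef_table <- test %>%
  split(.$Site) %>%                               # named list: A, B, C
  map(~ lm(value ~ Time * Group, data = .x)) %>%  # one model per Site
  map_dfr(tidy, .id = "Site")                     # stack coefficient tables, keep Site

coef_table
```

Each row of `coef_table` is one term of one model, with its estimate, standard error, t statistic, and p-value, so further filtering can be done with ordinary dplyr verbs.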

score 1 · Answer 2 · answered Oct 01 '20 at 21:21

A base R solution can be implemented with split() and lapply() as follows:

test <- data.frame(Site= rep(c( rep("A", 5),
                                rep("B", 5),
                                rep("C", 5)),2),
                   
                   value= c(rnorm(n = 15, mean = 1), rnorm(n = 15, mean = 1)),
                   Time= rep(rep((1:5), 3), 2),
                   Group= c(rep("G1", 15), rep("G2", 15))
)

models <- lapply(split(test,test$Site),function(x){
     lm(value ~ Time * Group, data = x)
})

models

...and the output:

$A

Call:
lm(formula = value ~ Time * Group, data = x)

Coefficients:
 (Intercept)          Time       GroupG2  Time:GroupG2  
      2.6466       -0.2999       -3.0912        0.7022  


$B

Call:
lm(formula = value ~ Time * Group, data = x)

Coefficients:
 (Intercept)          Time       GroupG2  Time:GroupG2  
      1.4547       -0.2859       -0.8216        0.5031  


$C

Call:
lm(formula = value ~ Time * Group, data = x)

Coefficients:
 (Intercept)          Time       GroupG2  Time:GroupG2  
     1.50226      -0.12825      -0.91705      -0.01143
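Since the result is a named list of lm objects, the pieces of each summary can be pulled out with sapply() or lapply(). The following is a minimal base R sketch; the names `coef_matrix`, `r_squared`, and `p_values` are hypothetical, and the seed is an assumption added only for reproducibility.

```r
set.seed(42)  # assumption: added only so the sketch is reproducible
test <- data.frame(
  Site  = rep(rep(c("A", "B", "C"), each = 5), 2),
  value = rnorm(n = 30, mean = 1),
  Time  = rep(1:5, 6),
  Group = rep(c("G1", "G2"), each = 15)
)

# One model per Site, as in the answer above
models <- lapply(split(test, test$Site), function(x) {
  lm(value ~ Time * Group, data = x)
})

# Coefficients: one row per Site, one column per model term
coef_matrix <- t(sapply(models, coef))

# R-squared: named numeric vector, one value per Site
r_squared <- sapply(models, function(m) summary(m)$r.squared)

# p-values: list with one named vector of term p-values per Site
p_values <- lapply(models, function(m) coef(summary(m))[, "Pr(>|t|)"])

coef_matrix
r_squared
p_values$A
```

Any other element of summary(m), such as sigma or adj.r.squared, can be extracted the same way.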
