R - how to do regression y~x for different id?

Question

Say I have a data frame with columns id, x and y:

df <- data.frame(id = c("A","A","A","B","B","B","B","B","C","C","C","C","C","D","D",'D'),
                 y = c(1,3,5,4,3,4,6,8,1,4,7,10,2,5,6,8),
                 x = c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4))

How can I fit the regression y ~ x separately for each id?

I see a similar question here.

But is there a simple way to just do what I need here?

score 0 · Answer 1 · answered Feb 15 '22 at 13:14

How about this? If you only want the model coefficients, you can use sapply() to build a matrix of results. Otherwise, lapply() (or sapply() with simplify = FALSE) will give you a list of the fitted models.

df <- data.frame(id = c("A","A","A","B","B","B","B","B","C","C","C","C","C","D","D",'D'),
                 y = c(1,3,5,4,3,4,6,8,1,4,7,10,2,5,6,8),
                 x = c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4))
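# coef() is the documented extractor; $coef only works through partial matching of $coefficients
coefs <- sapply(unique(df$id), function(i) coef(lm(y ~ x, data = subset(df, id == i))))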
coefs
#>              A        B         C        D
#> (Intercept) -1 2.117647 -1.411765 1.833333
#> x            2 1.029412  2.823529 1.500000

mods <- lapply(unique(df$id), function(i)lm(y ~ x, data=subset(df, id == i)))
mods
#> [[1]]
#> 
#> Call:
#> lm(formula = y ~ x, data = subset(df, id == i))
#> 
#> Coefficients:
#> (Intercept)            x  
#>          -1            2  
#> 
#> 
#> [[2]]
#> 
#> Call:
#> lm(formula = y ~ x, data = subset(df, id == i))
#> 
#> Coefficients:
#> (Intercept)            x  
#>       2.118        1.029  
#> 
#> 
#> [[3]]
#> 
#> Call:
#> lm(formula = y ~ x, data = subset(df, id == i))
#> 
#> Coefficients:
#> (Intercept)            x  
#>      -1.412        2.824  
#> 
#> 
#> [[4]]
#> 
#> Call:
#> lm(formula = y ~ x, data = subset(df, id == i))
#> 
#> Coefficients:
#> (Intercept)            x  
#>       1.833        1.500

Created on 2022-02-15 by the reprex package (v2.0.1)
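A variation on the same idea, offered as a minimal sketch rather than the answerer's own code: splitting the data frame with split() first gives a list named by id, so each fitted model can be looked up by its group label instead of by position. The names mods and coefs are reused here only for illustration.

```r
df <- data.frame(id = c("A","A","A","B","B","B","B","B","C","C","C","C","C","D","D","D"),
                 y = c(1,3,5,4,3,4,6,8,1,4,7,10,2,5,6,8),
                 x = c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4))

# split() returns a named list with one data frame per id
mods <- lapply(split(df, df$id), function(d) lm(y ~ x, data = d))

# models can now be addressed by id
coef(mods[["B"]])

# one row per id, one column per coefficient
coefs <- t(sapply(mods, coef))
coefs
```

Because the list carries the id values as names, downstream calls such as summary(mods[["C"]]) or predict(mods[["D"]], newdata = ...) need no bookkeeping of which position belongs to which group.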

score 0 · Answer 2 · answered Feb 15 '22 at 13:15

library(tidyverse)

df <- data.frame(
  id = c("A", "A", "A", "B", "B", "B", "B", "B", "C", "C", "C", "C", "C", "D", "D", "D"),
  y = c(1, 3, 5, 4, 3, 4, 6, 8, 1, 4, 7, 10, 2, 5, 6, 8),
  x = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4)
)
df %>%
  nest(data = -id) %>%  # name the list-column explicitly; bare nest(-id) is deprecated in tidyr >= 1.0
  mutate(model = data %>% map(~ lm(y ~ x, data = .x)))
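The pipeline above leaves the fitted models in a list-column. As a rough sketch of one way to get numbers out of it, assuming the tidyverse packages are installed, the coefficients can be pulled into ordinary columns with map_dbl(). The names fits, intercept and slope are illustrative choices, not part of the original answer.

```r
library(tidyverse)

df <- data.frame(
  id = c("A", "A", "A", "B", "B", "B", "B", "B", "C", "C", "C", "C", "C", "D", "D", "D"),
  y = c(1, 3, 5, 4, 3, 4, 6, 8, 1, 4, 7, 10, 2, 5, 6, 8),
  x = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4)
)

fits <- df %>%
  nest(data = -id) %>%
  mutate(
    # one lm per nested data frame
    model = map(data, ~ lm(y ~ x, data = .x)),
    # extract each coefficient by name into a plain numeric column
    intercept = map_dbl(model, ~ coef(.x)[["(Intercept)"]]),
    slope = map_dbl(model, ~ coef(.x)[["x"]])
  )

fits %>% select(id, intercept, slope)
```

Keeping the model column alongside the extracted values means the full lm objects stay available for later steps such as residual checks.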

score 0 · Answer 3 · answered Feb 15 '22 at 13:45

I got a warning message about an essentially perfect fit when I used your data set, but with a slightly different data frame this works.

You can use by(): pass it the data set df, the identifier variable df$id, and the function you want applied to each subset, here one that returns summary(lm(y ~ x, data = df)).

df <- data.frame(id = rep(letters[1:3], 10),
                 y = rnorm(30),
                 x = rnorm(30))
by(df, df$id, function(df) summary(lm(y ~ x, data = df)))
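The warning mentioned above presumably comes from summary() on group A, whose three points lie exactly on the line y = 2x - 1. A minimal sketch, assuming only the coefficients are needed: applying coef() instead of summary() inside by() runs on the question's original data, and the per-id results can then be stacked with do.call(rbind, ...). The names coef_by_id and coef_mat are made up for this example.

```r
df <- data.frame(id = c("A","A","A","B","B","B","B","B","C","C","C","C","C","D","D","D"),
                 y = c(1,3,5,4,3,4,6,8,1,4,7,10,2,5,6,8),
                 x = c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4))

# coef() only reads the fitted coefficients, so the perfect fit in group A is not an issue
coef_by_id <- by(df, df$id, function(d) coef(lm(y ~ x, data = d)))

# stack the per-id coefficient vectors into a matrix with one row per id
coef_mat <- do.call(rbind, coef_by_id)
coef_mat
```

If standard errors or p-values are needed, summary() is still the way to get them, and the result for a perfectly fitted group should be read with the warning in mind.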
