
I am trying to run a regression in a loop, with the variable name changing on each iteration, similar to this setup here.

At the end, I would like to save the fitted results in a list.

My code is the following:

year <- rep(2014:2015, length.out = 10000)
group <- sample(c(0,1,2,3,4,5,6), replace=TRUE, size=10000)
value <- sample(10000, replace = TRUE)
female <- sample(c(0,1), replace=TRUE, size=10000)
smoker <- sample(c(0,1), replace=TRUE, size=10000)

dta <- data.frame(year = year, group = group, value = value, female=female, smoker = smoker)

pc<- dta[, c("female", "smoker")]
names_pc <- names(pc)

m_fit <- vector("list", length(names_pc))

for (i in seq_along(names_pc)){
  m <- lm(value ~ year + group + group:names_pc[i], data = dta)
  m_fit[[i]] <- m$fit
}

... but something is wrong. I get the following error message.

Error in model.frame.default(formula = value ~ year + group + group:names_pc[i],  : 
  invalid type (NULL) for variable 'names_pc[i]'
Stata_user

2 Answers


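A formula is not interpolated, so names_pc[i] is never replaced by the column name it holds. Construct the formula as a string with sprintf() or paste0() instead: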

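m_fit <- vector("list", length(names_pc))

for (i in seq_along(names_pc)){
  # Substitute the current variable name into the formula string
  m <- lm(sprintf('value ~ year + group + group:%s', names_pc[i]), data = dta)
  # fitted() is the documented accessor; m$fit only works through partial matching
  m_fit[[i]] <- fitted(m)
}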
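As a variation on the loop above, here is a minimal base R sketch that builds each formula with reformulate() instead of a format string and names the resulting list by variable. The smaller sample size, the fixed seed, and the use of lapply() are assumptions made for illustration only; they are not part of the original question.

```r
# Assumption: a small reproducible data set shaped like the one in the question
set.seed(1)
n <- 200
dta <- data.frame(
  year   = rep(2014:2015, length.out = n),
  group  = sample(0:6, n, replace = TRUE),
  value  = sample(n, replace = TRUE),
  female = sample(0:1, n, replace = TRUE),
  smoker = sample(0:1, n, replace = TRUE)
)
names_pc <- c("female", "smoker")

# For each variable name: build the formula, fit the model, keep the fitted values
m_fit <- lapply(names_pc, function(v) {
  f <- reformulate(c("year", "group", paste0("group:", v)), response = "value")
  fitted(lm(f, data = dta))
})
names(m_fit) <- names_pc  # label each element by the interacted variable

str(m_fit)
```

Naming the list means a result can be pulled out by variable name, for example m_fit$smoker, rather than by position.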
Ronak Shah

Using the tidyverse (map() is from the purrr package):

Model:

library(purrr)  # provides map()
m <- map(dta[names_pc], ~lm(value ~ year + group + group:.x, data = dta))

Fit:

m_fit <- map(m, fitted)
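One thing to be aware of with the map() call above: because it iterates over the columns themselves, the interaction term appears in the model as group:.x rather than under the variable's own name. A possible alternative, sketched here under the assumption that readable coefficient names are wanted, is to iterate over the names and build the formula from a string. The small data set and the seed are illustrative assumptions.

```r
library(purrr)

# Assumption: a small reproducible data set shaped like the one in the question
set.seed(1)
n <- 200
dta <- data.frame(
  year   = rep(2014:2015, length.out = n),
  group  = sample(0:6, n, replace = TRUE),
  value  = sample(n, replace = TRUE),
  female = sample(0:1, n, replace = TRUE),
  smoker = sample(0:1, n, replace = TRUE)
)
names_pc <- c("female", "smoker")

# set_names() names the vector by its own values, so map() returns a named list
m <- map(set_names(names_pc), function(v) {
  lm(as.formula(paste0("value ~ year + group + group:", v)), data = dta)
})

# Extract the fitted values from every model
m_fit <- map(m, fitted)

# Inspect the coefficient names of the first model
names(coef(m$female))
```

With this version the models and fitted values can be looked up by variable name, and the interaction coefficient is labelled with the variable it involves.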

Yuriy Saraykin