
I have a data set with 8 variables: one response and seven predictors. I need a model for every possible two-way interaction, each containing all seven predictors plus a single two-way interaction term. So in my case there will be 7C2 = 21 models in total.

I have tried to produce the 21 models with a for loop, but the code fails at the lm() call inside the loop. The response variable, return, is the 5th column of my data.

colnames(dt) = c("assets","turnover_ratio","SD","sharpe_ratio","return",
                 "expense_ratio","fund_dummy","risk_dummy")
vars=colnames(dt)[-5] 
for (i in vars)  {
  for (j in vars) {
    if (i != j) {
      factor= paste(i,j,sep='*')}
    lm.fit <- lm(paste("return ~", factor), data=dt)
    print(summary(lm.fit))
  }}

The error message is given below for the code:

Error in paste("return ~", factor) : cannot coerce type 'closure' to vector of type 'character'

This is my data set: data set

The output below is one of the desired outputs; 20 more such models are needed, one for each of the other possible two-way interaction terms. All 7 predictors should be present in each model. The only thing that should change is the two-way interaction term.

This is my desired output among the 21 required: one desired output among the 21 required outputs

OTStats

3 Answers


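The following apply loop fits one model per pair of predictors. The 21 pairs are first obtained with combn. In each formula the * operator expands to both main effects plus their interaction, so every model contains all seven predictors and exactly one interaction term.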

vars <- colnames(dt)[-5] 
resp <- colnames(dt)[5] 

cmb <- combn(vars, 2)

lm_list <- apply(cmb, 2, function(regrs){
  inter_regrs <- paste(regrs, collapse = "*")
  other_regrs <- setdiff(vars, regrs)
  all_regrs <- paste(other_regrs, collapse = "+")
  all_regrs <- paste(all_regrs, inter_regrs, sep = "+")
  fmla <- as.formula(paste(resp, all_regrs, sep = "~"))
  lm(fmla, data = dt)
})

lapply(lm_list, summary)
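
The list returned by apply is unnamed, so it can be hard to tell which summary belongs to which interaction. A minimal sketch of one way to label the models and collect only the interaction rows, assuming simulated data with the question's column names (pairs and inter_tab are illustrative names, not part of the code above):

```r
# Simulated stand-in for the real data (same column names as the question)
set.seed(1234)
dt <- as.data.frame(replicate(8, rnorm(100)))
colnames(dt) <- c("assets", "turnover_ratio", "SD", "sharpe_ratio", "return",
                  "expense_ratio", "fund_dummy", "risk_dummy")

vars <- colnames(dt)[-5]
resp <- colnames(dt)[5]

# One list element per pair of predictors, labelled by its interaction term
pairs <- combn(vars, 2, simplify = FALSE)
names(pairs) <- vapply(pairs, paste, character(1), collapse = ":")

# Same formula construction as above: remaining predictors + a*b
lm_list <- lapply(pairs, function(regrs) {
  rhs <- paste(c(setdiff(vars, regrs), paste(regrs, collapse = "*")),
               collapse = "+")
  lm(as.formula(paste(resp, rhs, sep = "~")), data = dt)
})

# Keep only the interaction row of each coefficient table
inter_tab <- do.call(rbind, lapply(names(lm_list), function(nm) {
  coef(summary(lm_list[[nm]]))[nm, , drop = FALSE]
}))
print(inter_tab)
```

Because lapply keeps the names of its input, lm_list[["assets:SD"]] retrieves the model with that interaction directly.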

Data creation code.

set.seed(1234)
dt <- replicate(8, rnorm(100))
dt <- as.data.frame(dt)

colnames(dt) <- c("assets","turnover_ratio","SD",
              "sharpe_ratio","return","expense_ratio",
              "fund_dummy","risk_dummy")
Rui Barradas
  • How can I accommodate both the `+` and `:` terms in a single line in my case? I have `rhs <- unlist(sapply(1:length(iv), function(m) apply(combn(iv, m = m), 2, paste, collapse = ' + ')))`. Right now I run the `+` version and the `:` version separately; I would like to generate a formula that has both `+` and `:` together. – PesKchan Apr 28 '22 at 15:04
  • @PesKchan If you look at the first `paste`, the operator is `*`, this expands to both `+` and the interactions `:`. Is this it? – Rui Barradas Apr 28 '22 at 18:30
  • Yes. I want to know how to incorporate the same into my code; I tried but was not able to. – PesKchan Apr 28 '22 at 18:31
  • @PesKchan OK, I believe you should post the comment as a question, with sample data and a more complete description of the problem. If you have looked up similar questions and answers such as this one, include a link to them explaining why it didn't solve your problem. – Rui Barradas Apr 28 '22 at 18:43

Your problem is where the if block ends: the lm() call sits outside it, so it also runs when i == j. This code should run without the error:

colnames(dt) = c("assets","turnover_ratio","SD","sharpe_ratio","return",
                 "expense_ratio","fund_dummy","risk_dummy")
vars=colnames(dt)[-5] 
for (i in vars)  {
  for (j in vars) {
    if (i != j) {
      factor= paste(i,j,sep='*')
      lm.fit <- lm(paste0("return ~", factor), data=dt)
      print(summary(lm.fit))
    }
  }
}

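The problem was that in the first iteration i and j are equal, so the if block is skipped and the variable factor is never defined. paste() then picks up the base R function factor instead, which is why the error mentions a 'closure'. Also, try not to name a variable factor, since factor is a function in R.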
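
Note that the loop above regresses return on one pair at a time and visits each pair in both orders, which gives 42 fits rather than the 21 asked for. A minimal sketch of a loop variant that keeps all seven predictors in every model and visits each pair once, assuming simulated data with the question's column names (fits, base_rhs and inter are illustrative names):

```r
# Simulated stand-in for the real data (same column names as the question)
set.seed(1234)
dt <- as.data.frame(replicate(8, rnorm(100)))
colnames(dt) <- c("assets", "turnover_ratio", "SD", "sharpe_ratio", "return",
                  "expense_ratio", "fund_dummy", "risk_dummy")

vars <- colnames(dt)[-5]
base_rhs <- paste(vars, collapse = " + ")  # all seven main effects

fits <- list()
for (a in seq_len(length(vars) - 1)) {
  for (b in (a + 1):length(vars)) {        # b > a, so each pair is used once
    inter <- paste(vars[a], vars[b], sep = ":")
    fmla <- as.formula(paste("return ~", base_rhs, "+", inter))
    fits[[inter]] <- lm(fmla, data = dt)   # stored under the interaction's name
  }
}

length(fits)
print(summary(fits[["assets:turnover_ratio"]]))
```

Using `:` here adds only the interaction term, since the main effects are already in base_rhs.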

Santiago I. Hurtado

I think this should work and allow you to get rid of the loops:

lm.fit = lm(return ~ (.)^2, data=dt)
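
For reference, a quick check of what this one-liner fits: a single model with the seven main effects and all 21 two-way interactions together, not 21 separate models. The sketch assumes simulated data with the question's column names (fit_all and labs are illustrative names):

```r
# Simulated stand-in for the real data (same column names as the question)
set.seed(1234)
dt <- as.data.frame(replicate(8, rnorm(100)))
colnames(dt) <- c("assets", "turnover_ratio", "SD", "sharpe_ratio", "return",
                  "expense_ratio", "fund_dummy", "risk_dummy")

# (.)^2 expands to every main effect plus every two-way interaction
fit_all <- lm(return ~ (.)^2, data = dt)

labs <- attr(terms(fit_all), "term.labels")
n_main  <- sum(!grepl(":", labs, fixed = TRUE))  # number of main-effect terms
n_inter <- sum(grepl(":", labs, fixed = TRUE))   # number of interaction terms
print(c(main = n_main, interactions = n_inter))
```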

michotross
  • I thought this based on the title, but it seems OP wants to fit the models separately, using only a single interaction term at a time. – Gregor Thomas Oct 17 '19 at 16:18