Is there a way to treat each variable in turn as the target variable and regress it on the remaining variables? For example:

df
Date         Var1       Var2    Var3
27/9/2019     12         45      59
28/9/2019     34         43      54
29/9/2019     45         23      40

Usually, if I want to see the relationship between Var1 and Var2, I use the code below:

lm(Var1 ~ Var2, data = myData)

I want to see the results for all variables (Var1, Var2 and Var3): in the first instance, Var1 is the dependent variable and the rest (Var2 and Var3) are independent; in the second instance, Var2 is the dependent variable and the rest (Var1 and Var3) are independent; and so on. Is there a way to do this?

Cettt
Dev P
  • Your dependent variable is the variable whose response you are modeling against changes in your independent variable(s). You probably shouldn't just swap them around. The dependent and independent variables have specific definitions, and you should know which variable falls under which definition. If you want to know the correlation between the variables, you could use `cor(df$Var1, df$Var2)` to look at a single pair or `cor(df[,2:4])` to look at all of them in a matrix. – NoobR Oct 13 '19 at 14:47
  • Related: [1](https://stackoverflow.com/questions/28606549/how-to-run-lm-models-using-all-possible-combinations-of-several-variables-and-a), [2](https://stackoverflow.com/questions/22955617/linear-models-in-r-with-different-combinations-of-variables), [3](https://stackoverflow.com/questions/45270580/logistic-regression-how-to-try-every-combination-of-predictors-in-r). – Rui Barradas Oct 13 '19 at 14:59

3 Answers

You could use something like this to get the formulas you need:

vars <- names(df)[-1] # drop the Date column

# one formula per variable: that variable ~ all the others
forms <- lapply(seq_along(vars),
       function(i) formula(paste(vars[i], "~", paste(vars[-i], collapse = "+")))
       )

Output:

[[1]]
Var1 ~ Var2 + Var3
<environment: 0x7fdaaa63abd0>

[[2]]
Var2 ~ Var1 + Var3
<environment: 0x7fdaaa63c508>

[[3]]
Var3 ~ Var1 + Var2
<environment: 0x7fdaaec0d2a8>

Then you just need to pass each formula into lm in lapply:

mods <- lapply(forms, lm, data = df)

Output:

[[1]]

Call:
FUN(formula = X[[i]], data = ..1)

Coefficients:
(Intercept)         Var2         Var3  
    196.403        3.514       -5.806  


[[2]]

Call:
FUN(formula = X[[i]], data = ..1)

Coefficients:
(Intercept)         Var1         Var3  
   -55.8933       0.2846       1.6522  


[[3]]

Call:
FUN(formula = X[[i]], data = ..1)

Coefficients:
(Intercept)         Var1         Var2  
    33.8301      -0.1722       0.6053 
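
As a variation on the code above, here is a minimal sketch that builds the same formulas with `reformulate()` from base R and names the list by response variable, so each fit can be looked up by name. The data frame is re-typed from the three rows in the question, and the helper names (`vars`, `forms`, `mods`) simply mirror the answer. With only three rows and three coefficients per model, every fit is saturated, so this only illustrates the mechanics and not a meaningful regression.

```r
# Example data: the three rows from the question (Date kept as text)
df <- data.frame(
  Date = c("27/9/2019", "28/9/2019", "29/9/2019"),
  Var1 = c(12, 34, 45),
  Var2 = c(45, 43, 23),
  Var3 = c(59, 54, 40)
)

vars <- setdiff(names(df), "Date")  # every column except the date

# One formula per response: that variable ~ all the remaining ones
forms <- lapply(vars, function(v) reformulate(setdiff(vars, v), response = v))
names(forms) <- vars

# Fit each model; the list is indexed by response name, e.g. mods$Var2
mods <- lapply(forms, function(f) lm(f, data = df))

# Coefficients of every model, one list element per response
lapply(mods, coef)
```

Naming the list is a small convenience: `mods$Var2` or `summary(mods[["Var3"]])` is easier to read than positional indexing when there are many columns.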

If you want to regress Var1 against all other variables, you can use `.` on the right-hand side. Note that `.` expands to every other column in the data, so exclude Date explicitly:

lm(Var1 ~ . - Date, data = myData)

If you want to select more than one variable by name, you can also use:

lm(Var1 ~ Var2 + Var3, data = myData)
Cettt
  • Thanks, I agree. But what if I also want to regress Var2 and Var3? I think we would need a loop, right? I wanted to check whether there is another way. – Dev P Oct 13 '19 at 14:38

The following is based on the answers to these questions: 1, 2 and 3. See the explanations therein.

The main difference is that it loops (with lapply) through the columns of the input data set and constructs a full model with each column as the response and all the others as predictors. It then runs dredge on each full model fit; dredge requires the full model to be fitted with na.action = na.fail.

library(MuMIn)

model_list <- lapply(names(df1), function(resp){
  fmla <- as.formula(paste(resp, "~ ."))
  print(fmla)
  full <- lm(fmla, data = df1, na.action = na.fail)
  dredge(full)
})

model_list

Test data creation code.

set.seed(1234)
df1 <- replicate(3, sample(10:99, 100, TRUE))
df1 <- as.data.frame(df1)
names(df1) <- paste0("Var", 1:3)
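
If only the full models are of interest, the same loop can be sketched in base R without the `dredge` step. The block below is an illustration under that assumption, not part of the original answer: it recreates the test data and collects R-squared values for each response into one data frame. The name `fit_stats` is made up for this example.

```r
# Recreate the test data from the answer
set.seed(1234)
df1 <- as.data.frame(replicate(3, sample(10:99, 100, TRUE)))
names(df1) <- paste0("Var", 1:3)

# For each column: fit response ~ all other columns and keep two summary numbers
fit_stats <- do.call(rbind, lapply(names(df1), function(resp) {
  fit <- lm(as.formula(paste(resp, "~ .")), data = df1)
  s <- summary(fit)
  data.frame(response = resp,
             r.squared = s$r.squared,
             adj.r.squared = s$adj.r.squared)
}))

fit_stats
```

Because the columns are independent random draws, the R-squared values should be close to zero; with real data the table gives a quick side-by-side view of how well each variable is explained by the others.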
Rui Barradas