1) SO questions are supposed to provide the test data reproducibly, but here we have done it for you using the built-in data.frame anscombe. After defining the test data, we define a data frame containing the columns we want and the appropriate formula. Finally we call lm:
# test data
matrix1 <- as.matrix(anscombe[5:8])
matrix2 <- as.matrix(anscombe[1:4])
DF <- data.frame(matrix1[, 1, drop = FALSE], matrix2) # cols are y1, x1, x2, x3, x4
fo <- sprintf("%s ~ (.)^%d", colnames(matrix1)[1], ncol(matrix2)) # "y1 ~ (.)^4"
lm(fo, DF)
giving:
Call:
lm(formula = fo, data = DF)
Coefficients:
(Intercept)           x1           x2           x3           x4        x1:x2
    12.8199      -2.6037           NA           NA      -0.1626       0.3628
      x1:x3        x1:x4        x2:x3        x2:x4        x3:x4     x1:x2:x3
         NA           NA           NA           NA           NA      -0.0134
   x1:x2:x4     x1:x3:x4     x2:x3:x4  x1:x2:x3:x4
         NA           NA           NA           NA
2) A variation of this, which gives a slightly nicer result in the Call: part of the lm output, is the following. We use DF from above. do.call will pass the contents of the fo variable rather than its name, so that we see the formula in the Call: part of the output. On the other hand, quote(DF) is used to force the name DF to display rather than the contents of the data.frame.
lhs <- colnames(matrix1)[1]
rhs <- paste(colnames(matrix2), collapse = "*")
fo <- paste(lhs, rhs, sep = " ~ ") # "y1 ~ x1*x2*x3*x4"
do.call("lm", list(fo, quote(DF)))
giving:
Call:
lm(formula = "y1 ~ x1*x2*x3*x4", data = DF)
Coefficients:
(Intercept)           x1           x2           x3           x4        x1:x2
    12.8199      -2.6037           NA           NA      -0.1626       0.3628
      x1:x3        x2:x3        x1:x4        x2:x4        x3:x4     x1:x2:x3
         NA           NA           NA           NA           NA      -0.0134
   x1:x2:x4     x1:x3:x4     x2:x3:x4  x1:x2:x3:x4
         NA           NA           NA           NA
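The two formulas used in (1) and (2), y1 ~ (.)^4 and y1 ~ x1*x2*x3*x4, should describe the same model, which is why the coefficients agree. As a rough check (a sketch only, not part of the original answer), one can expand both with terms and compare the resulting term labels. The names labels_power, labels_star and same_terms are made up here for illustration, and DF is rebuilt directly from anscombe so that the snippet runs on its own.

```r
# Rebuild the data frame used above: y1 plus the four x columns of anscombe
DF <- data.frame(anscombe["y1"], anscombe[1:4])

# Expand both formulas into their term labels.
# data = DF is needed so that terms() can resolve the "." on the right-hand side.
labels_power <- attr(terms(y1 ~ (.)^4, data = DF), "term.labels")
labels_star  <- attr(terms(y1 ~ x1 * x2 * x3 * x4), "term.labels")

# Compare as sets, so the check does not depend on the order of the terms
same_terms <- setequal(labels_power, labels_star)
print(same_terms)
print(length(labels_power))
```

Comparing with setequal rather than identical is deliberate: it ignores the order in which the interaction terms are listed, which is all that could differ between the two outputs.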