
I have to perform multiple linear regression for many vectors of dependent variables on the same matrix of independent variables.

For example, I want to create 3 models such that:

lm( d ~ a + b + c )
lm( e ~ a + b + c )
lm( f ~ a + b + c )

from the following matrix, where a, b, and c are the independent variables and d, e, and f are the dependent variables:

       [,1]     [,2]     [,3]     [,4]     [,5]     [,6]
[1,]    a1       b1       c1       d1       e1       f1
[2,]    a2       b2       c2       d2       e2       f2
[3,]    a3       b3       c3       d3       e3       f3

I then want to store the coefficients from the regression in another matrix (I have reduced the number of columns and vectors in my example for ease of explanation).

  • Please show an example that is [reproducible](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Also, check [here](http://stackoverflow.com/questions/31377737/repeat-regression-with-varying-dependent-variable) – akrun Jul 13 '15 at 08:40
  • Sorry, I am a first-time user. The link looks similar to what I am trying to do. Thanks – Christopher Baker Jul 13 '15 at 08:57
  • How is that other matrix supposed to look? There are no coefficients for the dependent variables. – Mike Wise Jul 13 '15 at 08:58
  • Thanks **Ken Benoit** , exactly what I needed. – Christopher Baker Jul 13 '15 at 12:05

2 Answers

Here's a method that is not very general, but it will work if you substitute your own dependent variable names in depvar, the independent variables common to all models in the inner lm() call, and your own dataset name. It is demonstrated here on mtcars, a built-in dataset supplied with R.

depvar <- c("mpg", "disp", "qsec")
regresults <- lapply(depvar, function(dv) {
    tmplm <- lm(get(dv) ~ cyl + hp + wt, data = mtcars)
    coef(tmplm)
})
# returns a list, where each element is a vector of coefficients
# do.call(rbind, ) will paste them together
allresults <- data.frame(depvar = depvar, 
                         do.call(rbind, regresults))
# tidy up name of intercept variable
names(allresults)[2] <- "intercept"
allresults
##   depvar  intercept        cyl          hp        wt
## 1    mpg   38.75179 -0.9416168 -0.01803810 -3.166973
## 2   disp -179.04186 30.3212049  0.21555502 59.222023
## 3   qsec   19.76879 -0.5825700 -0.01881199  1.381334

Edit based on suggestion by @Mike Wise:

If you want a purely numeric data frame but still want to keep the identifier, you can supply it through the row.names argument, like this:

allresults <- data.frame(do.call(rbind, regresults),
                         row.names = depvar)
# tidy up name of intercept variable
names(allresults)[1] <- "intercept"
allresults
##       intercept        cyl          hp        wt
## mpg    38.75179 -0.9416168 -0.01803810 -3.166973
## disp -179.04186 30.3212049  0.21555502 59.222023
## qsec   19.76879 -0.5825700 -0.01881199  1.381334
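
A possible alternative, sketched here under the assumption that every model uses exactly the same independent variables: lm() also accepts a matrix of responses on the left-hand side of the formula, so the three regressions can be fitted in one call. The names fit and coefmat are illustrative only.

```r
# Bind the dependent variables into a response matrix; lm() then fits
# one regression per column, all on the same independent variables.
fit <- lm(cbind(mpg, disp, qsec) ~ cyl + hp + wt, data = mtcars)

# coef() returns a matrix with one row per term and one column per
# response; transpose it to get one row per dependent variable.
coefmat <- t(coef(fit))
coefmat
```

Each row of coefmat should match the coefficients from fitting the corresponding model separately.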
Ken Benoit
  • This seems to me to be what he meant. I think the question needs to be edited to remove the dependent variable columns for this answer to match the question, though. – Mike Wise Jul 13 '15 at 09:20
  • Thanks, you are right. I added an alternative that returns a data.frame consisting only of coefficients. – Ken Benoit Jul 13 '15 at 09:29

I recently ran into the same issue. A quick and easy way to handle it is to add the results to a data frame manually using the coefficients() function.

coeffdf <- data.frame(coefficients(lm1),coefficients(lm2))

This works well when every regression uses the same independent variables, so that the coefficient vectors line up row by row.
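
As a minimal runnable sketch of this approach, the snippet below fits two models on the built-in mtcars dataset. The models behind lm1 and lm2 are assumptions chosen for illustration, as are the column names mpg and disp.

```r
# Two models with different dependent variables but the same
# independent variables (an illustrative choice, using mtcars).
lm1 <- lm(mpg ~ cyl + hp + wt, data = mtcars)
lm2 <- lm(disp ~ cyl + hp + wt, data = mtcars)

# One column per model; row names are taken from the names of the
# coefficient vectors, so each row corresponds to one term.
coeffdf <- data.frame(mpg = coefficients(lm1),
                      disp = coefficients(lm2))
coeffdf
```

Naming the columns explicitly keeps track of which model each set of coefficients came from.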

QMad