Is there a command in R that writes the results of an lm model to a data frame, like outest in SAS? I am running multiple models and want the result to look like this:

Model  |  alpha   | Beta | Rsquared | F |  df |
model0 |  8.4     | ...  | ....     | ..|  .. |
model1 |  ...     | ...  | ....     | ..|  .. |
model2 |  ...     | ...  | ....     | ..|  .. |

The data I have is 'ds', which looks like this:

X1 | X2 | Y1 |
.. | .. | .. |
.. | .. | .. |
.. | .. | .. |
.. | .. | .. |

My code is a set of simple lm calls:

model0 <- lm(Y1 ~ X1, ds)
model1 <- lm(Y1 ~ 1, ds)
model2 <- lm(Y1 ~ X1 + X2, ds)
RHelp

2 Answers

I do exactly the same thing. The difficulty is that models can have different numbers of coefficients, which would give each row a different number of columns, and a data.frame cannot hold that. Every model needs the same set of columns.

I normally use this for glm (those snippets are left commented out), but I adapted it to lm for you:

models <- c()

for (i in 1:10) {

    y <- rnorm(100) # generate some example data for lm
    x <- rnorm(100)
    m <- lm(y ~ x)

    # in case of glm:
    #m <- glm(y ~ x, data = data, family = "quasipoisson")
    #overdispersion <- 1/m$df.residual*sum((data$count-fitted(m))^2/fitted(m))

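    s <- summary(m)            # compute the summary once and reuse it
    coefs <- s$coefficients    # one row per term: estimate, std. error, t value, p-value
    v.coef <- c(t(coefs))      # flatten the matrix row by row into one vector
    names(v.coef) <- paste(rep(rownames(coefs), each = 4), c("coef", "stderr", "t", "p-value"))
    v.model_info <- c(r.squared = s$r.squared, F = unname(s$fstatistic[1]), df.res = s$df[2])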

    # in case of glm:
    #v.model_info <- c(overdisp = summary(m)$dispersion, res.deviance = m$deviance, df.res = m$df.residual, null.deviance = m$null.deviance, df.null = m$df.null)

    v.all <- c(v.coef, v.model_info)    
    models <- rbind(models, cbind(data.frame(model = paste("model", i, sep = "")), t(v.all)))

}

I prefer to take the values from summary(m). To bundle them into a data.frame, use cbind (column bind) to attach the model name to each row and rbind (row bind) to stack the rows.
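
The loop above fits the same formula every time, so the columns always line up. For models with different terms, as in the question, one possible workaround is to pad each coefficient vector to the union of all term names and fill the gaps with NA. This is only a sketch: mtcars stands in for the question's 'ds', and the names fits, all_terms, coef_rows, and coef_table are made up for illustration.

```r
# Stand-in data: mtcars replaces the question's 'ds' (assumption)
fits <- list(
  model0 = lm(mpg ~ wt, data = mtcars),
  model1 = lm(mpg ~ 1, data = mtcars),
  model2 = lm(mpg ~ wt + hp, data = mtcars)
)

# Union of all coefficient names across the models
all_terms <- unique(unlist(lapply(fits, function(m) names(coef(m)))))

# Index each coefficient vector by the full name set; absent terms become NA
coef_rows <- lapply(fits, function(m) {
  row <- coef(m)[all_terms]
  names(row) <- all_terms
  row
})

# Stack the padded vectors and attach the model names as a column
coef_table <- data.frame(model = names(fits),
                         do.call(rbind, coef_rows),
                         check.names = FALSE, row.names = NULL)
coef_table
```

Here model1 gets NA in the wt and hp columns, so all three rows share the same columns.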

Tomas

You can use the coefficients function:

out = coefficients(lm(mpg ~ wt, mtcars))
out
# (Intercept)          wt 
#   37.285126   -5.344472 
out[1]
# (Intercept) 
#    37.28513 

Or, for a group of lm objects:

library(plyr)
out = ldply(list(model0, model1, model2), coefficients)
rownames(out) = sprintf('model%d', 0:2)
out
#        (Intercept)        wt
# model0    37.28513 -5.344472
# model1    37.28513 -5.344472
# model2    37.28513 -5.344472

To extend this to the table you want, you need to:

  1. Find out how to extract the other information you need from an lm object.
  2. Write a custom function which returns a one-row data.frame which contains all the information.
  3. Run it using the ldply syntax I showed.
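
The three steps above could be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: lm_row is a hypothetical helper, mtcars stands in for 'ds', and base R's Map and rbind are used in place of ldply so the example has no package dependency.

```r
# Hypothetical helper: returns a one-row data.frame of fit statistics
lm_row <- function(m, name) {
  s <- summary(m)
  f <- s$fstatistic   # NULL when the model has only an intercept
  data.frame(
    model     = name,
    alpha     = unname(coef(m)["(Intercept)"]),
    r.squared = s$r.squared,
    F         = if (is.null(f)) NA_real_ else unname(f["value"]),
    df.res    = df.residual(m),
    stringsAsFactors = FALSE
  )
}

# Stand-in data: mtcars replaces the question's 'ds' (assumption)
fits <- list(
  model0 = lm(mpg ~ wt, data = mtcars),
  model1 = lm(mpg ~ 1, data = mtcars),
  model2 = lm(mpg ~ wt + hp, data = mtcars)
)

# One row per model, stacked into a single data.frame
out <- do.call(rbind, Map(lm_row, fits, names(fits)))
rownames(out) <- NULL
out
```

Slope columns are left out because the models do not share the same predictors; they would need NA padding before the rows could be stacked.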
Paul Hiemstra
  • I should have elaborated. I am running 5 models and I want the results as shown in the table above. My bad. – RHelp Dec 30 '13 at 12:15
  • Please expand your question, including test code and test data. – Paul Hiemstra Dec 30 '13 at 12:19
  • I have edited my question. Sorry for the confusion. Does this help? – RHelp Dec 30 '13 at 12:26
  • I would prefer code that I can simply copy-paste into R, see http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example. I added some details you can use to answer your question. – Paul Hiemstra Dec 30 '13 at 12:34