I'm trying to switch from lm() to the faster lm.fit() in order to calculate R² values from large matrices more quickly. (I don't think I can use cor() when x is a matrix, per the question "Function to calculate R2 (R-squared) in R".)

Why do lm() and lm.fit() calculate different fitted values and residuals?

set.seed(0)
x <- matrix(runif(50), 10)
y <- 1:10
lm(y ~ x)$residuals
lm.fit(x, y)$residuals

I wasn't able to penetrate the lm() source code to figure out what could be contributing to the difference...

Martin Smith

1 Answer

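From ?lm.fit, x should be a "design matrix of dimension n * p", where p is the number of coefficients. lm() adds the intercept column for you when it builds the design matrix from the formula; lm.fit() uses x exactly as given. You therefore have to pass a column of ones for the intercept to get the same model.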

Thus estimating

lm.fit(cbind(1,x), y)

will give the same parameters as

lm(y ~ x)
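
As a quick check, here is a minimal sketch using the data from the question. It compares the residuals from both calls and then computes R² directly from the lm.fit() output. The helper name r_squared is hypothetical (not part of base R), and the formula assumes the model includes an intercept.

```r
set.seed(0)
x <- matrix(runif(50), 10)
y <- 1:10

fit_lm  <- lm(y ~ x)
fit_raw <- lm.fit(cbind(1, x), y)  # prepend a column of ones for the intercept

# lm() names its residuals, so drop names before comparing
same_resid <- all.equal(unname(fit_lm$residuals), unname(fit_raw$residuals))

# Hypothetical helper: R-squared as 1 - RSS/TSS (assumes an intercept is in the model)
r_squared <- function(fit, y) {
  1 - sum(fit$residuals^2) / sum((y - mean(y))^2)
}

# Compare against the value reported by summary.lm()
same_r2 <- all.equal(r_squared(fit_raw, y), summary(fit_lm)$r.squared)

same_resid
same_r2
```

Both comparisons should come back TRUE if the intercept column is the only difference between the two fits.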
user20650