Why do calculated residuals differ between R functions `lm()` and `lm.fit()`

Question

I'm trying to switch from lm() to lm.fit() in order to calculate r² values from large matrices faster. (I don't think I can use cor(), per "Function to calculate R2 (R-squared) in R", when x is a matrix.)

Why do lm() and lm.fit() calculate different fitted values and residuals?

set.seed(0)
x <- matrix(runif(50), 10)
y <- 1:10
lm(y ~ x)$residuals
lm.fit(x, y)$residuals

I wasn't able to penetrate the lm() source code to figure out what could be contributing to the difference...

`lm.fit(cbind(1,x), y)$residuals` – user20650, Apr 29 '21 at 09:53

score 1 · Accepted Answer · answered Apr 29 '21 at 10:02

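From ?lm.fit, x "should be design matrix of dimension n * p", where p is the number of coefficients. Unlike lm(), whose formula interface adds an intercept automatically, lm.fit() uses x exactly as given. You therefore have to add a column of ones for the intercept yourself to get the same model.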

Thus estimating

lm.fit(cbind(1,x), y)

will give the same parameters as

lm(y ~ x)
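
As a quick check, here is a minimal sketch that reuses the data from the question. The variable names (fit_lm, fit_raw, r2_raw and so on) are only illustrative. It compares the two fits with all.equal(), since they should agree up to floating-point error rather than bit for bit, and then computes r² directly from the lm.fit() residuals, assuming the model includes an intercept.

```r
set.seed(0)
x <- matrix(runif(50), 10)
y <- 1:10

fit_lm  <- lm(y ~ x)                 # formula interface adds the intercept
fit_raw <- lm.fit(cbind(1, x), y)    # intercept column added by hand

# Names differ between the two results, so drop them before comparing
res_equal  <- isTRUE(all.equal(unname(fit_lm$residuals), unname(fit_raw$residuals)))
coef_equal <- isTRUE(all.equal(unname(coef(fit_lm)), unname(fit_raw$coefficients)))

# Without the column of ones, lm.fit() fits the no-intercept model
noint_equal <- isTRUE(all.equal(unname(lm(y ~ x - 1)$residuals),
                                unname(lm.fit(x, y)$residuals)))

# r-squared from the lm.fit() residuals: 1 - RSS / TSS
# (this form assumes an intercept is in the model)
r2_raw <- 1 - sum(fit_raw$residuals^2) / sum((y - mean(y))^2)
r2_lm  <- summary(fit_lm)$r.squared

print(c(res_equal, coef_equal, noint_equal))
print(all.equal(r2_raw, r2_lm))
```

The third comparison shows where the original discrepancy comes from: lm.fit(x, y) matches lm(y ~ x - 1), not lm(y ~ x).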
