This is more of a general question. If I fit an lm() model with one independent and one dependent variable, the R-squared produced in the summary table is different from the value I get in a correlation table with the exact same variables. How come?

hmnoidk
- This seems like a general statistics question. Such questions should be asked at [stats.se]. Otherwise, specific programming questions for Stack Overflow should have a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and the output you are seeing, to make it possible to describe exactly what's going on. – MrFlick Sep 26 '19 at 14:43
- But in general the R^2 value you get from a regression is the [coefficient of determination](https://en.wikipedia.org/wiki/Coefficient_of_determination), which is the squared correlation between your observed response and predicted response. It is not the correlation of the response and an independent variable (that wouldn't make sense when you have more than one independent variable). – MrFlick Sep 26 '19 at 14:46
- Because correlation = R, not R^2. – Aaron left Stack Overflow Sep 26 '19 at 14:49
1 Answer
Please be more specific in your questions. We don't know what correlation calculation is being referred to, since there are no details or examples in the question. If the correlation is calculated correctly, then its square is indeed equal to the R^2 from the regression.
fm <- lm(demand ~ Time, BOD)  # simple regression: one dependent, one independent variable
summary(fm)$r.squared
## [1] 0.6449202
cor(BOD$demand, BOD$Time)^2                 # squared correlation of the two variables
## [1] 0.6449202
cor(fitted(fm), BOD$demand)^2               # squared correlation of fitted and observed
## [1] 0.6449202
cor(fitted(fm), fitted(fm) + resid(fm))^2   # same, since fitted + residuals = observed
## [1] 0.6449202
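As the comments note, the correlation itself is R, not R^2, so a correlation table compared directly against the summary's R-squared will show different numbers. A minimal check with the same data (both expressions print the same value, approximately 0.803):

cor(BOD$demand, BOD$Time)      # this is R, not R^2
sqrt(summary(fm)$r.squared)    # |R|; same value here since the slope is positive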
The above is for one independent variable, but we can extend this to more:
fm2 <- lm(cyl ~ ., mtcars)                    # multiple regression: all other columns as predictors
summary(fm2)$r.squared
## [1] 0.9349896
cor(fitted(fm2), mtcars$cyl)^2                # squared correlation of fitted and observed
## [1] 0.9349896
cor(fitted(fm2), fitted(fm2) + resid(fm2))^2  # same, since fitted + residuals = observed
## [1] 0.9349896
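Note that with several predictors, the squared pairwise correlation between the response and any single predictor is at most the model R^2, and generally smaller; only the correlation with the fitted values recovers it. A quick illustration using one predictor from the same data:

cor(mtcars$cyl, mtcars$mpg)^2   # about 0.726, well below the model R^2 of about 0.935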

G. Grothendieck