I am using lm() on a large data set in R. With summary() one can get a lot of detail about the linear regression between two parameters.

The part I am confused about is which value in the Coefficients: section of the summary is the correct one to use as the correlation coefficient.

Sample Data

c1 <- c(1:10)
c2 <- c(10:19)
output <- summary(lm(c1 ~ c2))

Summary

Call:
lm(formula = c1 ~ c2)

Residuals:
      Min         1Q     Median         3Q        Max 
-2.280e-15 -8.925e-16 -2.144e-16  4.221e-16  4.051e-15 

Coefficients:
             Estimate Std. Error    t value Pr(>|t|)    
(Intercept) -9.000e+00  2.902e-15 -3.101e+15   <2e-16 ***
c2           1.000e+00  1.963e-16  5.093e+15   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.783e-15 on 8 degrees of freedom
Multiple R-squared:      1, Adjusted R-squared:      1 
F-statistic: 2.594e+31 on 1 and 8 DF,  p-value: < 2.2e-16

Is this the correlation coefficient I should use?

output$coefficients[2,1]
1

Please suggest, thanks.

Chetan Arvind Patil
  • This is not a correlation coefficient but a parameter estimate. – Miha Aug 04 '17 at 21:27
  • @Miha - Which parameter should I use then? The question [here](https://stackoverflow.com/questions/6577058/extract-regression-coefficient-values) extracted a specific parameter based on what its OP required. – Chetan Arvind Patil Aug 04 '17 at 21:31
  • So what is your desired output? Are you predicting the value of the dependent variable, and would you like to extract the regression coefficient (which estimates the change in the mean response per unit increase in the independent variable)? If so, then you have already extracted this parameter with your code. – Miha Aug 04 '17 at 21:34
  • @Miha - The overall goal is to find the `correlation coefficient` between the two parameters in question. I am using `residual values` for `outlier` analysis, but I just need to log the `correlation coefficient` to see how the parameters (stacked column-wise) behave relative to each other. – Chetan Arvind Patil Aug 04 '17 at 21:38
  • Then you used the wrong code/test. To calculate the correlation coefficient between two parameters, you should use `cor.test(c1,c2,method="pearson")`. In this case the results are the same (the correlation coefficient equals the parameter estimate), but this is a coincidence: it happens only because c1 and c2 have the same standard deviation. – Miha Aug 04 '17 at 21:43
  • @Miha - Thanks. Are you sure that value won't match what I am extracting with `output$coefficients[2,1]` for other, unknown parameters? – Chetan Arvind Patil Aug 04 '17 at 21:48
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/151112/discussion-between-miha-and-chetan-arvind-patil). – Miha Aug 04 '17 at 21:49
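
To make the distinction drawn in the comments concrete, here is a minimal sketch with made-up data (the vectors `x` and `y` and the seed are illustrative assumptions, not taken from the question). For a simple regression with one predictor, the slope equals the Pearson correlation scaled by sd(y)/sd(x), so the two numbers agree only when both variables have the same standard deviation.

```r
# Hypothetical data, chosen so that x and y have different spreads
set.seed(1)
x <- 1:20
y <- 3 * x + rnorm(20, sd = 5)

fit <- lm(y ~ x)
slope <- unname(coef(fit)[2])   # regression slope: change in y per unit of x
r <- cor(x, y)                  # Pearson correlation, always within [-1, 1]

# The slope is the correlation rescaled by the ratio of standard deviations
slope_from_r <- r * sd(y) / sd(x)

# With a single predictor, R-squared is the squared correlation,
# and the sign of the correlation matches the sign of the slope
r_from_fit <- sign(slope) * sqrt(summary(fit)$r.squared)

print(c(slope = slope, r = r, slope_from_r = slope_from_r, r_from_fit = r_from_fit))
```

If only the correlation is needed, `cor(x, y)` is enough; `cor.test()` additionally reports a test statistic and confidence interval.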

1 Answer

The full variance-covariance matrix of the coefficient estimates is:

fm <- lm(c1 ~ c2)
vcov(fm)

In particular, sqrt(diag(vcov(fm))) equals coef(summary(fm))[, 2], the Std. Error column of the summary.

The corresponding correlation matrix is:

cov2cor(vcov(fm))

The correlation between the coefficient estimates is:

cov2cor(vcov(fm))[1, 2]
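
As a quick check of the relationships above, the sketch below uses noisy made-up data (the vectors `x` and `y` are assumptions for illustration; the question's c1 and c2 fit perfectly, which makes the standard errors essentially zero). It also suggests that the correlation between the intercept and slope estimates is a different quantity from cor(x, y).

```r
# Hypothetical noisy data so the standard errors are not numerically zero
set.seed(42)
x <- 10:19
y <- 2 * x + rnorm(10)

fm <- lm(y ~ x)
V <- vcov(fm)

# Standard errors from the diagonal of vcov() match the summary table
se_from_vcov <- sqrt(diag(V))
se_from_summary <- coef(summary(fm))[, "Std. Error"]
print(all.equal(se_from_vcov, se_from_summary))

# cov2cor() divides each covariance by the product of the two standard errors
est_cor <- cov2cor(V)[1, 2]
est_cor_manual <- unname(V[1, 2] / (se_from_vcov[1] * se_from_vcov[2]))

# Correlation between the two estimates versus correlation between the data
print(c(est_cor = est_cor, cor_xy = cor(x, y)))
```
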
G. Grothendieck