I originally had this formula:
lm(PopDif ~ RailDensityDif + Ports + Coast, data = Pop)
and got a coefficient of 1,419,000 for `RailDensityDif`, -0.1011 for `Ports`, and 3418 for `Coast`. After scaling the variables:
lm(scale(PopDif) ~ scale(RailDensityDif) + scale(Ports) + scale(Coast), data = Pop)
my coefficient for `RailDensityDif` is 0.02107 and 0.2221 for `Coast`, so now `Coast` is more significant than `RailDensityDif`. I know scaling isn't supposed to change the significance, so why did this happen?

- *"so now Coast is more significant than RailDensityDif"* What is this statement based on? All you're reporting are parameter *estimates*; please make your post reproducible by including (1) sample data, and (2) code to reproduce parameter estimates for both models. Perhaps useful in this context (and for future posts) is advice on how to provide a [minimal reproducible example/attempt](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – Maurits Evers Feb 01 '19 at 07:55
- Also relevant is the following post that explains how p-values characterising the statistical significance of parameters in a linear model can in fact change when standardising (scaling) variables: [Standardized estimates give different p-value with a glmer/lmer](https://stats.stackexchange.com/questions/180904/standardized-estimates-give-different-p-value-with-a-glmer-lmer) – Maurits Evers Feb 01 '19 at 07:58
- To expand on my comments above, please take a look at my post below for a practical example (based on `mtcars`). – Maurits Evers Feb 01 '19 at 10:25
- This is too old to migrate, but really belongs on CrossValidated. – Ben Bolker Oct 12 '19 at 16:59
1 Answer
tl;dr: The p-values characterising the statistical significance of parameters in a linear model may change after scaling (standardising) the variables.
As an example, I will work with the `mtcars` dataset, and regress `mpg` on `disp` and `drat`; or in R's formula language, `mpg ~ disp + drat`.
1. Three linear models
We fit three different (OLS) linear models; they differ only in how the variables are scaled.
To start, we don't do any scaling.
m1 <- lm(mpg ~ disp + drat, data = mtcars)
Next, we scale values using `scale`, which by default does two things: (1) it centers values at 0 by subtracting the mean, and (2) it scales values to have unit variance by dividing the (centered) values by their standard deviation.
m2 <- lm(mpg ~ disp + drat, data = as.data.frame(scale(mtcars)))
Note that we can apply `scale` to the `data.frame` directly, which will scale values by column. `scale` returns a `matrix`, so we need to transform the resulting object back to a `data.frame`.
Finally, we scale values using `scale` without centering. (Strictly speaking, with `center = FALSE` each column is divided by its root mean square rather than its standard deviation; what matters here is that every column is simply divided by a constant.)
m3 <- lm(mpg ~ disp + drat, data = as.data.frame(scale(mtcars, center = F)))
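Since we rely on this behaviour below, here is a quick check (my addition) of the divisor `scale` uses when `center = FALSE`; per `?scale`, it is the root mean square of each column rather than the standard deviation:

```r
# With center = FALSE, scale() divides each column by its
# root mean square, sqrt(sum(x^2) / (n - 1)), not by sd(x)
x <- mtcars$disp
all.equal(as.vector(scale(x, center = FALSE)),
          x / sqrt(sum(x^2) / (length(x) - 1)))
#> [1] TRUE
```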
2. Comparison of parameter estimates and statistical significance
Let's inspect the parameter estimates for `m1`:
summary(m1)$coef
# Estimate Std. Error t value Pr(>|t|)
#(Intercept) 21.84487993 6.747971087 3.237252 3.016655e-03
#disp -0.03569388 0.006652672 -5.365345 9.191388e-06
#drat 1.80202739 1.542091386 1.168561 2.520974e-01
We get the t values as the ratios of the parameter estimates and their standard errors; the p-values then follow as twice the area under the density of the t-distribution with `df = nrow(mtcars) - 3` degrees of freedom (as we estimate 3 parameters) beyond `|t|`, corresponding to a two-sided t-test. So for example, for `disp` we confirm the t value
summary(m1)$coef["disp", "Estimate"] / summary(m1)$coef["disp", "Std. Error"]
#[1] -5.365345
and the p-value
2 * pt(summary(m1)$coef["disp", "Estimate"] / summary(m1)$coef["disp", "Std. Error"], nrow(mtcars) - 3)
#[1] 9.191388e-06
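One caveat (my note, not part of the original answer): `2 * pt(t, df)` only gives the two-sided p-value here because the t value for `disp` happens to be negative. A version that is correct for either sign uses the upper tail of `|t|`:

```r
# Two-sided p-value, valid for positive and negative t values alike
t_disp <- summary(m1)$coef["disp", "t value"]
2 * pt(abs(t_disp), df = nrow(mtcars) - 3, lower.tail = FALSE)
#> [1] 9.191388e-06
```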
Let's take a look at the results from `m2`:
summary(m2)$coef
# Estimate Std. Error t value Pr(>|t|)
#(Intercept) -1.306994e-17 0.09479281 -1.378790e-16 1.000000e+00
#disp -7.340121e-01 0.13680614 -5.365345e+00 9.191388e-06
#drat 1.598663e-01 0.13680614 1.168561e+00 2.520974e-01
Notice that the t value (and hence the p-value) of the intercept differs from that in `m1`, due to the centering of the data; the t values of `disp` and `drat` are unchanged.
If, however, we don't center the values and only rescale them
summary(m3)$coef
# Estimate Std. Error t value Pr(>|t|)
#(Intercept) 1.0263872 0.31705513 3.237252 3.016655e-03
#disp -0.4446985 0.08288348 -5.365345 9.191388e-06
#drat 0.3126834 0.26757994 1.168561 2.520974e-01
we can see that while the estimates and standard errors differ from the (unscaled) results of `m1`, their respective ratios (i.e. the t values) are identical. So (default) `scale(...)` can change the statistical significance of parameter estimates (here, only that of the intercept), while `scale(..., center = FALSE)` will not.
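To make this explicit (my addition), we can collect the t values of all three models side by side; only the `(Intercept)` row differs between the centered model `m2` and the uncentered models `m1` and `m3`:

```r
# t values of all three models, column by column
sapply(list(m1 = m1, m2 = m2, m3 = m3),
       function(m) summary(m)$coef[, "t value"])
```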
It's easy to see why dividing values by a constant (such as their standard deviation) does not change the ratio of an OLS parameter estimate and its standard error, by taking a look at their closed forms; see e.g. here.
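As a sketch of that argument (my addition, using standard OLS results rather than the derivation linked above): with

$$\hat\beta = (X^\top X)^{-1} X^\top y, \qquad \widehat{\mathrm{Var}}(\hat\beta) = \hat\sigma^2 (X^\top X)^{-1},$$

rescaling the columns of $X$ by constants, $X^* = X D^{-1}$ with $D = \operatorname{diag}(s_1, \dots, s_p)$, gives

$$\hat\beta^* = (X^{*\top} X^*)^{-1} X^{*\top} y = D\hat\beta, \qquad \mathrm{SE}(\hat\beta_j^*) = s_j\,\mathrm{SE}(\hat\beta_j),$$

since the fitted values, and hence $\hat\sigma^2$, are unchanged. The ratio $\hat\beta_j^*/\mathrm{SE}(\hat\beta_j^*) = \hat\beta_j/\mathrm{SE}(\hat\beta_j)$ is therefore identical; rescaling $y$ by a constant likewise multiplies every estimate and standard error by the same factor. Centering, by contrast, changes what the intercept measures (the fitted response at the predictor means rather than at zero), which is why only the intercept's t value changed in `m2`.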

- Note that the p-value **only changes for the intercept** in this case. Centring will only change p-values for coefficients other than the intercept if the model contains interactions. – Ben Bolker Oct 12 '19 at 17:06