4

I'm fitting a linear model using OLS and have scaled my regressors with the function scale in R because the variables are measured in different units. I then fit the model using the lm command and get the coefficients of the fitted model. As far as I know, the coefficients of the fitted model are not in the same units as the original regressors and therefore must be scaled back before they can be interpreted. I have been searching for a direct way to do this but couldn't find anything. Does anyone know how to do that?

Please have a look at the code. Could you help me implement what you proposed?

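library(zoo)

# Read the dated series, then drop the zoo index to get a plain data frame
filename <- "DataReg4.csv"
filepath <- paste("C:/Reg/", filename, sep = "")
separator <- ";"
readfile <- read.zoo(filepath, sep = separator, header = TRUE, format = "%m/%d/%Y", dec = ".")
readfile <- as.data.frame(readfile)
str(readfile)

# Standardise every column (note: this scales the response USD_EUR as well)
DF <- as.data.frame(scale(readfile))
fm <- lm(USD_EUR ~ diff_int + GDP_US + Net.exports.Eur, data = DF)
summary(fm)
plot(fm)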

I'm sorry, this is the data:

http://www.mediafire.com/?hmcp7urt0ag8187

DirtStats
nopeva
  • As far as I know, you don't need to scale them before you fit the linear model. Also, can you give a reproducible example in case I have misunderstood? – liuminzhao Jan 24 '13 at 20:45
  • Hi, you're right in general, but I learnt that it is good practice when there are signs of multicollinearity or the units of measurement are considerably different between the regressor variables. I'm working on a toy example where I use the USD.EUR as a response and the GDP, exports, imports, etc. as regressors. There is a high degree of multicollinearity in the fitted model. Many different techniques might be tested but right now I would like to check this one. – nopeva Jan 24 '13 at 20:51
  • 1
    @liuminzhao You don't, but if you want to compare the effect sizes to help with interpretation of the model, importance of variables etc., and those variables are measured on different scales, then standardising them is one way to achieve a common scale for comparison. – Gavin Simpson Jan 24 '13 at 20:52
  • @user1228124 and Gavin. Thanks for the advice. Learnt a lot. – liuminzhao Jan 24 '13 at 21:09

3 Answers

11

If you used the scale function with default arguments then your regressors will be centered (subtracting their mean) and divided by their standard deviations. You can interpret the coefficients without transforming them back to the original units:

Holding everything else constant, on average, a one standard deviation change in one of the regressors is associated with a change in the dependent variable corresponding to the coefficient of that regressor.

If you have included an intercept term in your model, keep in mind that the interpretation of the intercept will change. The estimated intercept now represents the average level of the dependent variable when all of the regressors are at their average levels. This is a result of subtracting the mean from each variable.
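The intercept statement can be checked with a small simulation. The following is only a sketch on made-up data: the names `x1`, `x2`, `y` and the coefficients used to generate them are assumptions for illustration, not the question's dataset. Only the regressors are scaled.

```r
# Simulated data; variable names and generating coefficients are illustrative only
set.seed(1)
n  <- 100
x1 <- rnorm(n, mean = 50, sd = 10)
x2 <- rnorm(n, mean = 0.5, sd = 0.1)
y  <- 2 + 0.3 * x1 - 4 * x2 + rnorm(n)

# Scale the regressors only; the response stays in its original units
dat <- data.frame(y = y, scale(cbind(x1, x2)))
fit <- lm(y ~ x1 + x2, data = dat)

# With centred regressors, the OLS intercept should equal the mean of the response
intercept_matches_mean <- isTRUE(all.equal(unname(coef(fit)[1]), mean(y)))
print(intercept_matches_mean)
```

The comparison uses `all.equal` rather than `==` because the two numbers agree only up to floating-point tolerance.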

To interpret the coefficients in non-standard deviation terms, just calculate the standard deviation of each regressor and multiply that by the coefficient.

Dan Gerlanc
4

To de-scale or back-transform regression coefficients from a regression done with scaled predictor variable(s) and a non-scaled response variable, the intercept and slope should be calculated as:

A = As - Bs*Xmean/sdx
B = Bs/sdx

thus the regression is,

Y = As - Bs*Xmean/sdx + Bs/sdx * X

where

As = intercept from the scaled regression
Bs = slope from the scaled regression
Xmean = the mean of the predictor variable before scaling
sdx = the standard deviation of the predictor variable before scaling
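These expressions follow from substituting the scaling back into the fitted equation. Writing the scaled predictor as $X_s = (X - \bar{X})/s_X$, where $\bar{X}$ (Xmean above) and $s_X$ (sdx above) are computed from the predictor before scaling, and assuming a single predictor as in the formulas above:

```latex
\begin{aligned}
Y &= A_s + B_s X_s \\
  &= A_s + B_s \,\frac{X - \bar{X}}{s_X} \\
  &= \underbrace{\left(A_s - \frac{B_s \bar{X}}{s_X}\right)}_{A}
     + \underbrace{\frac{B_s}{s_X}}_{B}\, X
\end{aligned}
```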

This can be adjusted if Y was also scaled but it appears you decided not to do that ultimately with your dataset.
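A minimal sketch of this back-transformation in R follows. It runs on simulated data (the column names `x1`, `x2`, `y` and all numeric values are assumptions for illustration) and extends the single-predictor formulas to several predictors by summing the intercept correction over them. The result is compared against a fit on the unscaled data.

```r
# Simulated data; names and values are illustrative, not the question's dataset
set.seed(42)
n   <- 200
raw <- data.frame(x1 = rnorm(n, 100, 15), x2 = rnorm(n, 3, 0.5))
raw$y <- 1.5 + 0.02 * raw$x1 - 0.8 * raw$x2 + rnorm(n, sd = 0.2)

# Scale the predictors only; scale() stores the means and SDs as attributes
Xs     <- scale(raw[, c("x1", "x2")])
x_mean <- attr(Xs, "scaled:center")
x_sd   <- attr(Xs, "scaled:scale")

fit_scaled <- lm(y ~ x1 + x2, data = data.frame(y = raw$y, Xs))
bs <- coef(fit_scaled)

# Back-transform: B = Bs / sdx, A = As - sum(Bs * Xmean / sdx)
slopes    <- bs[-1] / x_sd
intercept <- bs[1] - sum(bs[-1] * x_mean / x_sd)
back_transformed <- c(intercept, slopes)

# Reference fit on the unscaled data for comparison
fit_raw <- lm(y ~ x1 + x2, data = raw)
print(all.equal(unname(back_transformed), unname(coef(fit_raw))))
```

If the response had been scaled as well, the same algebra suggests multiplying each back-transformed slope and the intercept term by the standard deviation of Y and then adding the mean of Y to the intercept.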

DirtStats
  • 1
    This seems to be correct, the other two are wrong (I think) - you should indeed divide the scaled coefficients by the scaling factor. The slope term is `Bs/s*X = Bs*(X/s)` – Ben Bolker Oct 18 '21 at 17:08
2

If I understand your description (which is unfortunately code-free at the moment), you are getting standardized regression coefficients for Y ~ As + Bs*Xs, where all those "s" items are scaled variables. The coefficients are then the predicted change in Y, on a standard deviation scale, associated with a change in X of one standard deviation of X. The scale function would have recorded the means and standard deviations as attributes of the scaled object. If not, then you will have those estimates somewhere in your console log. The estimated change dY for a change dX in X should be: dY*(1/sdY) = Bs*dX*(1/sdX). Predictions should be something along these lines:

Yest = As*(sdX) + Xmn + Bs*(Xs)*(sdX)

You probably should not have needed to standardize the Y values, and I'm hoping that you didn't because it makes dealing with the adjustment for the means of the X's easier. Put some code and example data in if you want implemented and checked answers. I think @DanielGerlance is correct in saying to multiply rather than divide by the SD's.

IRTFM
  • Thanks for your answer, I have posted my code. It is not clear to me whether multiplying the coefficients by their sample standard deviation is enough to scale the coefficients back. I'm using an intercept but the result is an estimate of 0 and a p-value of 1. Can you help me interpret that? – nopeva Jan 25 '13 at 01:58
  • The intercept in such a model (where you scaled the Y as well as the X's) should be zero since you subtracted means from everything in the scaling process. That's why I added back the means in my revised expression. (You also need to add back the Y_mean.) The Wald statistic is for a test of the intercept being zero, and it is, so it makes sense that the p-value would be 1. That's the code but not the data. – IRTFM Jan 25 '13 at 02:05
  • Thanks, I have finally scaled the X's only and multiplied the coefficients by the sds. – nopeva Jan 27 '13 at 09:23
  • 1
    I was using this answer to back-transform/descale a regression and I don't believe the formula in this answer is quite right. See my new answer for a different solution. – DirtStats Oct 30 '17 at 16:16