
Ok, so here is the code that demonstrates the problem I am referring to:

library(car)  # for hccm() and linearHypothesis()

set.seed(1)   # make the example reproducible

# Toy data: x1 is tiny, x2 is huge, x3 is on a moderate scale
x1 <- c(0.001, 0.002, 0.003, 0.0003)
x2 <- c(15000893, 23034340, 3034300, 232332242)
x3 <- c(1, 3, 5, 6)
y  <- rnorm(4)

model  <- lm(y ~ x1 + x2)
model2 <- lm(y ~ x1 + x3)

# Replace the usual standard errors with HC0 (White) robust ones
type  <- "hc0"
V     <- hccm(model, type = type)
sumry <- summary(model)
table <- coef(sumry)
table[, 2] <- sqrt(diag(V))
table[, 3] <- table[, 1] / table[, 2]
table[, 4] <- 2 * pt(abs(table[, 3]), df.residual(model), lower.tail = FALSE)
sumry$coefficients <- table

# Joint test that all slope coefficients are zero
p   <- nrow(table)
hyp <- cbind(0, diag(p - 1))
linearHypothesis(model, hyp, white.adjust = type)

Note that this is not caused by perfect multicollinearity. As you can see, I deliberately set x2 to be very large and x1 to be very small. When the scales differ this much, I cannot run a linearHypothesis test that all slope coefficients of model <- lm(y ~ x1 + x2) are 0, i.e. linearHypothesis(model, hyp, white.adjust=type). R throws the following error:

> linearHypothesis(model, hyp, white.adjust=type)
Error in solve.default(vcov.hyp) : 
  system is computationally singular: reciprocal condition number = 2.31795e-23

However, when I use model2 <- lm(y ~ x1 + x3) instead, where x3 is on a scale comparable to x1, the linearHypothesis test succeeds:

> linearHypothesis(model2, hyp, white.adjust=type)
Linear hypothesis test

Hypothesis:
x1 = 0
x3 = 0

Model 1: restricted model
Model 2: y ~ x1 + x3

Note: Coefficient covariance matrix supplied.

  Res.Df Df      F Pr(>F)
1      3                 
2      1  2 11.596 0.2033

I am aware that this is probably because R cannot invert a matrix whose reciprocal condition number falls below a certain threshold, in this case 2.31795e-23. However, is there a way to circumvent that? Is this a limitation of R itself or of the underlying linear algebra (LAPACK) routines? What is good practice here? The only method I can think of is to rescale the variables so that they are on comparable scales, but I am also concerned about how much information I would lose by dividing everything by its standard deviation.
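
As an illustration of that workaround, here is a minimal sketch using the toy data above; dividing by sd(x2) and the names x2_s / model_s are my own choices, not part of the original code:

x2_s    <- x2 / sd(x2)        # bring x2 onto a scale comparable to x1
model_s <- lm(y ~ x1 + x2_s)  # refit with the rescaled regressor
linearHypothesis(model_s, hyp, white.adjust = type)  # same joint test as before

With only four observations the test itself is not very informative, but the solve() call should no longer fail, because the entries of the covariance matrix are now of comparable magnitude.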

In fact, I have 200 variables that are percentages and 10 variables (including the dependent variables) that are large (potentially on the order of 10^6). Scaling them one by one would be tedious.
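
A minimal sketch of scaling many columns at once, along the lines of the lapply(df, scale) suggestion in the comments below; df here is a hypothetical data frame holding only the regressors, not an object from the question:

sds       <- sapply(df, sd)                    # keep the original SDs for back-transformation
df_scaled <- as.data.frame(lapply(df, scale))  # centre and scale every column in one step
model_all <- lm(y ~ ., data = df_scaled)       # y is assumed to be a separate vector
# coef(model_all)[v] / sds[v] gives the coefficient of variable v on its original scale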

Jinhua Wang
  • Why is scaling "troublesome"? Scaling variables is common and easy. Some people do it always. They don't even need to be the *same* scale, just not so vastly different. Use `x2_scaled = scale(x2)` instead of `x2`. – Gregor Thomas Sep 18 '19 at 17:51
  • The error message states _**computationally** singular_, which may indicate a precision issue of R or of your machine. – jay.sf Sep 18 '19 at 17:54
  • @Gregor I thought it is troublesome because I have 200 variables that are small, and 10 variables that are large. – Jinhua Wang Sep 18 '19 at 18:03
  • @Gregor Besides, I am also a little concerned about the information loss from dividing everything by standard deviations. – Jinhua Wang Sep 18 '19 at 18:04
  • @jay.sf Thanks. I think it is a precision issue indeed. – Jinhua Wang Sep 18 '19 at 18:08
  • So scale all of them. But no information needs to be lost - just save the SDs before you scale. You can transform the coefficients back to the original scale if you want. You can scale all the variables in a data frame with `df[] = lapply(df, scale)`. You can also scale just some of them with `vars_to_scale = c("var1", "var8", ...); df[vars_to_scale] = lapply(df[vars_to_scale], scale)` – Gregor Thomas Sep 18 '19 at 18:20
  • @Gregor I don't think it is that easy to recover the coefficients - because scaling usually involves demeaning and dividing by standard deviation. See this post: https://stats.stackexchange.com/questions/189652/is-it-a-good-practice-to-always-scale-normalize-data-for-machine-learning – Jinhua Wang Sep 18 '19 at 19:23
  • For nonlinear methods, e.g., SVM, you are losing useful information. [For ordinary linear regression, recovering the coefficients is not difficult](https://stackoverflow.com/a/14512703/903061). The centering (subtracting the mean) is accounted for in the intercept, and the scaling (dividing by the SD) can be compensated for too. Try it! `m1 = lm(mpg ~ wt + drat, data = mtcars); m2 = lm(mpg ~ wt + scale(drat), data = mtcars)`. You'll see that multiplying the `drat` coefficient from `m1` by `sd(mtcars$drat)` gets you the `scale(drat)` coef in `m2`. The `wt` coefficient is unchanged. (See the sketch below the comments.) – Gregor Thomas Sep 18 '19 at 19:38
  • @Gregor Looks like scaling could affect linearHypothesis test though: https://stats.stackexchange.com/questions/340544/linear-hypothesis-tests-with-unstandardized-and-standardized-variables-why-are – Jinhua Wang Sep 18 '19 at 19:59
  • Not being able to fit your model has a larger effect. – Gregor Thomas Sep 18 '19 at 20:12
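
To make the back-transformation described in the comments concrete, a minimal sketch with the mtcars example mentioned by Gregor Thomas (base R only):

m1 <- lm(mpg ~ wt + drat, data = mtcars)
m2 <- lm(mpg ~ wt + scale(drat), data = mtcars)
coef(m1)["drat"] * sd(mtcars$drat)  # equals coef(m2)["scale(drat)"]; the wt coefficient is unchanged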
