I am trying to fit several linear regressions in R on a large dataset (around 2 million observations and around 10 variables). Most of the variables take values between 0 and 100. When I run the regression with lm(), it produces the following error:
Error: vector memory exhausted (limit reached?)
After scaling my variables with this function:
rescale(df$var1, to = c(0, 1))
R is suddenly able to fit my regressions. Do you have any idea why R can only compute the coefficients after I transform my variables with rescale()?
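For reference, here is a minimal sketch of the workflow, with simulated data standing in for my real dataset. The column names (y, var1, var2), the formula, and the small n are placeholders chosen for illustration, and I am assuming rescale() is the one from the scales package:

```r
# Hypothetical reproduction of the setup; the simulated data and the
# column names (y, var1, var2) are assumptions, not the real dataset.
library(scales)  # assumed source of rescale()

set.seed(1)
n <- 1000  # kept small so this runs quickly; the real data has ~2 million rows
df <- data.frame(
  y    = runif(n, 0, 100),
  var1 = runif(n, 0, 100),
  var2 = runif(n, 0, 100)
)

# Fit on the unscaled variables (this is the step that fails on the real data)
fit_raw <- lm(y ~ var1 + var2, data = df)

# Rescale each predictor to [0, 1], then fit again
df_scaled <- df
df_scaled$var1 <- rescale(df$var1, to = c(0, 1))
df_scaled$var2 <- rescale(df$var2, to = c(0, 1))
fit_scaled <- lm(y ~ var1 + var2, data = df_scaled)

# Compare column types and memory footprint before and after rescaling
str(df)
str(df_scaled)
format(object.size(df), units = "Mb")
format(object.size(df_scaled), units = "Mb")
```

At this size both fits succeed, so the sketch only shows the steps and where I would compare column types and object sizes. It does not reproduce the error itself.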
Furthermore, do you know of any more efficient ways to fit linear models that I could use? I have also tried biglm(), but it produced the same error with the unscaled variables.