I would like to loop through the independent variables and regress the dependent variable on each of them individually with data.table. Because my dataset is very large, I need an efficient solution. I found this suggestion, which uses the mtcars data frame as an example:
library(data.table)
Fits <- as.data.table(mtcars)[, list(MyFits = lapply(.SD[, -1, with = FALSE], function(x) summary(lm(mpg ~ x))))]
I first tried it on a few of my own datasets without much success. I then applied it to mtcars itself, which gave an unexpected result: 10 rows in the column MyFits, each looking like the example below.
list(call = lm(formula = mpg ~ x),
     terms = mpg ~ x,
     residuals = c(0.370164348925326, 0.370164348925409, -3.58141592920354, 0.770164348925413,
                   3.82174462705436, -2.52983565107458, -0.578255372945635, -1.98141592920354,
                   -3.58141592920354, -1.42983565107459, -2.82983565107459, 1.52174462705436,
                   2.42174462705436, 0.321744627054363, -4.47825537294564, -4.47825537294564,
                   -0.178255372945637, 6.01858407079646, 4.01858407079646, 7.51858407079646,
                   -4.88141592920354, 0.621744627054364, 0.321744627054363, -1.57825537294564,
                   4.32174462705436, 0.918584070796464, -0.381415929203536, 4.01858407079646,
                   0.921744627054365, -0.929835651074587, 0.121744627054364, -4.98141592920354),
     coefficients = c(37.8845764854614, -2.87579013906447, 2.07384360552423, 0.322408882659104,
                      18.2678078445963, -8.91969884745751, 8.36915530493018e-18, 6.11268714258098e-10),
     aliased = c(FALSE, FALSE),
     sigma = 3.20590203190608,
     df = c(2, 30, 2),
     r.squared = 0.726180005093805,
     adj.r.squared = 0.717052671930265,
     fstatistic = c(79.5610275293349, 1, 30),
     cov.unscaled = c(0.418457648546144, -0.0625790139064475, -0.0625790139064475, 0.0101137800252844)
)
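As far as I can tell, each element is a complete summary.lm object stored in a list column, not a compact table. Below is a rough sketch of a one-row-per-predictor layout that would be easier to work with, assuming only the slope, its p-value and R-squared are of interest. The helper fit_one and the column names are made up for illustration, and this is not the linked answer's method:

```r
library(data.table)

dt <- as.data.table(mtcars)

# Every column except the response mpg is treated as a predictor
predictors <- setdiff(names(dt), "mpg")

# fit_one is a hypothetical helper: regress mpg on a single predictor
# and keep only a few numbers from the model summary
fit_one <- function(v) {
  s <- summary(lm(reformulate(v, response = "mpg"), data = dt))
  data.table(
    predictor = v,
    slope     = s$coefficients[2, "Estimate"],   # row 2 is the predictor, row 1 the intercept
    p_value   = s$coefficients[2, "Pr(>|t|)"],
    r_squared = s$r.squared
  )
}

# One row per predictor, stacked into a single data.table
res <- rbindlist(lapply(predictors, fit_one))
print(res)
```

For cyl this should reproduce the slope and r.squared values visible in the output above. Whether this scales well to a very large dataset is untested.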
The author of the answer to "Linear Regression loop for each independent variable individually against dependent" already mentioned that the answer needed an update, but I cannot figure out what is going wrong.
Any suggestions?