I'm trying to run some tests on my linear regression model with functions such as ols_vif_tol(), ols_test_normality() or durbinWatsonTest(), which only work with objects returned by lm(). However, I fitted my model using the train() function of the caret package.
> fitcontrol = trainControl( method = "repeatedcv", number = floor(0.4*nrow(TrainData)), repeats = RepeatsTC, returnResamp = "all", savePredictions = "all")
> BestModel = train(Formula2, data = TrainData, trControl = fitcontrol, method = "lm", metric = "RMSE")
At the end I get this output:
> BestModel
Linear Regression
10 samples
1 predictor
No pre-processing
Resampling: Cross-Validated (4 fold, repeated 100 times)
Summary of sample sizes: 7, 8, 8, 7, 7, 8, ...
Resampling results:
RMSE Rsquared MAE
10.75823 0.8911761 9.660638
Tuning parameter 'intercept' was held constant at a value of TRUE
What I want is to have this output:
> GoodModel = lm(Formula2, data = FinalData)
> GoodModel
Call:
lm(formula = Formula2, data = FinalData)
Coefficients:
(Intercept) Evol.INDUS.PROD
4.089 3.908
So, even though I used method = "lm", I don't get the same kind of object, which gives me an error when I run my tests.
> ols_test_normality(BestModel)
Error in ols_test_normality.default(BestModel) : y must be numeric
> ols_test_normality(GoodModel)
-----------------------------------------------
Test Statistic pvalue
-----------------------------------------------
Shapiro-Wilk 0.9042 0.1528
Kolmogorov-Smirnov 0.1904 0.6661
Cramer-von Mises 1.1026 0.0010
Anderson-Darling 0.4615 0.2156
-----------------------------------------------
I know there is an as.lm function, but I tried it and the version I have does not seem to support it.
Does anyone know how to get an object of the same form as the one lm() returns after using train(), or a way to use the output of BestModel to run those tests?
EDIT
Here is a simpler case that gives rise to the same error and where you can try different tests.
install.packages("olsrr")
install.packages("caret")
library(olsrr)
library(caret)
first = sample(1:10, 10, replace = TRUE)
second = sample(10:20, 10, replace = TRUE)
third = sample(20:30, 10, replace = TRUE)
Df = data.frame(first, second, third)
Df
#Create a model with lm
Model1 = lm(first ~ second + third, data = Df)
Model1
summary(Model1)
ols_test_normality(Model1)
#Create a model with caret::train
Fold = sample(1:nrow(Df) ,size = 0.8*nrow(Df), replace = FALSE)
TrainData = Df[Fold,]
TestData = Df[-Fold,]
fitcontrol = trainControl(method = "repeatedcv", number = 2, repeats = 10)
Model2 = train(first ~ second + third, data = TrainData, trControl = fitcontrol, method = "lm")
Model2
summary(Model2)
ols_test_normality(Model2)
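One direction that may be worth checking, offered as a sketch rather than a confirmed fix: a caret train object keeps the model refit on the full training data in its finalModel element, and with method = "lm" that element should be an ordinary lm object. The example below uses made-up data, and the names Df2, FitAll, FinalLm and DirectLm are hypothetical, introduced only for illustration. It uses base R's shapiro.test() instead of the olsrr functions so that it depends on caret alone.

```r
library(caret)

set.seed(1)  # made-up data, for illustration only
Df2 <- data.frame(second = runif(20, 10, 20), third = runif(20, 20, 30))
Df2$first <- 1 + 0.5 * Df2$second - 0.2 * Df2$third + rnorm(20)

ctrl <- trainControl(method = "repeatedcv", number = 2, repeats = 10)
FitAll <- train(first ~ second + third, data = Df2,
                trControl = ctrl, method = "lm")

# The train object itself is not an lm object ...
class(FitAll)

# ... but the model refit on all of Df2 is stored inside it
FinalLm <- FitAll$finalModel
class(FinalLm)

# It should have the same coefficients as a direct lm() fit on the same data
DirectLm <- lm(first ~ second + third, data = Df2)
all.equal(unname(coef(FinalLm)), unname(coef(DirectLm)))

# Tests that only need the residuals can then be run on the extracted model
shapiro.test(residuals(FinalLm))
```

Because caret fits the model internally with its own formula (.outcome ~ .) and its own data frame, functions that re-read the original call or data may still fail on the extracted object, so whether ols_vif_tol(), ols_test_normality() or durbinWatsonTest() accept it is something to verify case by case.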
Thank you