I am trying to evaluate and stack learning algorithms using time series cross-validation with caret and caretEnsemble. caret supports this directly via method="timeslice"; caretEnsemble, however, does not and reports:
caretList does not currently know how to handle cross-validation method='timeslice'. Please specify trControl$index manually
It seems possible to create the indices manually with createTimeSlices. However, passing them to trainControl yields the following error in the minimal working example below:
Error: Stopping
In addition: Warning message:
In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
  There were missing values in resampled performance measures.
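For reference, the kind of indices involved can be reproduced in base R. This is only a sketch of fixed-window, rolling-origin slices as I understand createTimeSlices to produce them with initialWindow = 365 and horizon = 1; the helper name `make_time_slices` and the slice names are hypothetical and not part of caret:

```r
# Hypothetical helper (not part of caret): fixed-window rolling-origin slices.
make_time_slices <- function(n_obs, initial_window, horizon = 1) {
  # Position of the last training observation in each slice
  stops <- seq(initial_window, n_obs - horizon)
  train <- lapply(stops, function(s) seq(s - initial_window + 1, s))
  test  <- lapply(stops, function(s) seq(s + 1, s + horizon))
  # Illustrative names only; caret's own naming may differ
  names(train) <- paste0("Train", stops)
  names(test)  <- paste0("Test", stops)
  list(train = train, test = test)
}

slices <- make_time_slices(n_obs = 1090, initial_window = 365, horizon = 1)
length(slices$train)      # number of resamples
range(slices$train[[1]])  # first training window
slices$test[[1]]          # first held-out observation
```

With 1090 observations this gives 725 resamples, each evaluated on a single held-out observation. A one-row test set may leave some performance measures (such as R-squared) undefined, which would be consistent with the warning about missing values, although that is an assumption rather than a confirmed cause of the error.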
library(tidyverse)
library(caret)
library(caretEnsemble)
dfml <- data.frame(date = seq(as.Date("2014/09/04"), by = "day", length.out = 1090),
                   y = rnorm(1090),
                   x = rnorm(1090))
index <- createTimeSlices(dfml$y, 365, horizon = 1, fixedWindow = TRUE)
n <- 726
seeds <- vector(mode = "list", length = n) # list of n empty (NULL) elements
for (i in 1:(n - 1)) {
  seeds[[i]] <- sample.int(1000, 3)
}
seeds[[n]] <- sample.int(1000, 1)
myIndexControl <- trainControl(method = "cv",
                               allowParallel = TRUE,
                               seeds = seeds,
                               index = index$train,
                               indexOut = index$test)
alg_list <- c("glmnet", "gbm", "lm")
multi_mod <- caretList(y ~ .,
                       data = dfml,
                       trControl = myIndexControl,
                       methodList = alg_list,
                       family = "gaussian",
                       metric = "RMSE",
                       seeds = seeds)
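As a sanity check on the seeds object used above: the seeds argument of trainControl is documented as a list with one element per resample plus one extra element for the final model, where the per-resample elements are integer vectors and the last element is a single integer. A minimal base-R sketch of that shape, assuming 725 resamples and 3 integers per resample (both taken from the example above; `make_seeds` is a hypothetical helper, not a caret function):

```r
# Hypothetical helper: build a seeds list of length n_resamples + 1.
make_seeds <- function(n_resamples, n_tune, seed = 1) {
  set.seed(seed)
  # One integer vector per resample
  seeds <- lapply(seq_len(n_resamples), function(i) sample.int(1000, n_tune))
  # Single integer for the final model fit
  seeds[[n_resamples + 1]] <- sample.int(1000, 1)
  seeds
}

seeds_check <- make_seeds(n_resamples = 725, n_tune = 3)
length(seeds_check)                     # resamples + 1
lengths(seeds_check)[c(1, 725, 726)]    # per-element lengths
```

Whether 3 integers per resample is enough depends on how many tuning combinations each model in the list evaluates, so that number may need to be larger for some methods.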
Any suggestion or workaround would be greatly appreciated. A single model can be tuned this way:
glmnet.fit <- train(y ~ .,
                    data = dfml,
                    method = "glmnet",
                    verbose = FALSE,
                    trControl = myIndexControl,
                    seeds = seeds,
                    tuneLength.num = 2,
                    linout = TRUE)