I am trying to evaluate and stack learning algorithms with time series cross-validation using caret and caretEnsemble. caret supports this directly via method = "timeslice". caretEnsemble, however, does not, and caretList stops with:

caretList does not currently know how to handle cross-validation method='timeslice'. Please specify trControl$index manually

It seems to be possible to create the indices manually using createTimeSlices and pass them to trainControl through index and indexOut.
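For reference, here is a small sketch of what createTimeSlices returns, using a toy series of 10 points. The series length and window size are illustrative assumptions, not the values from my data.

```r
library(caret)

# Toy series: 10 observations (illustrative only)
y <- seq_len(10)

# Rolling window of 5 training points, forecasting 1 step ahead
slices <- createTimeSlices(y, initialWindow = 5, horizon = 1, fixedWindow = TRUE)

# The result is a list with two elements, train and test,
# each holding one index vector per resample
names(slices)
length(slices$train)   # number of resamples
slices$train[[1]]      # rows used for fitting in the first slice
slices$test[[1]]       # row held out in the first slice
```

With fixedWindow = TRUE and horizon = 1 the number of slices should be length(y) - initialWindow, so 1090 rows with a window of 365 would give 725 resamples. As far as I understand, the seeds list needs one element per resample plus one for the final model, which is where n = 726 comes from.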

However, this yields the following error in the minimal working example below:

Error: Stopping
In addition: Warning message:
In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
  There were missing values in resampled performance measures.
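A possibly related observation, which I have not confirmed as the cause of the error: with horizon = 1 every hold-out set contains a single observation, and caret's default regression summary includes Rsquared, which cannot be computed from one point. The sketch below uses made-up numbers to show this with postResample. It may explain the warning, but not necessarily the "Error: Stopping".

```r
library(caret)

# A hold-out set with a single observation, as produced by horizon = 1
# (the numbers are made up for illustration)
single <- postResample(pred = 0.3, obs = 0.1)
single

# RMSE and MAE are defined for one point, but Rsquared is a squared
# correlation and cannot be computed from a single pair
is.na(single["Rsquared"])
```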

library(tidyverse)
library(caret)
library(caretEnsemble)

# Simulated daily data: 1090 rows with a date, a response y and a predictor x
dfml <- data.frame(date = seq(as.Date("2014/09/04"), by = "day", length.out = 1090),
                   y = rnorm(1090),
                   x = rnorm(1090))

# Rolling window of 365 days, one-step-ahead hold-out
index <- createTimeSlices(dfml$y, initialWindow = 365, horizon = 1, fixedWindow = TRUE)

# One seed vector per resample, plus a single seed for the final model
n <- 726
seeds <- vector(mode = "list", length = n)
for (i in seq_len(n - 1)) {
  seeds[[i]] <- sample.int(1000, 3)
}
seeds[[n]] <- sample.int(1000, 1)

# Pass the time slices manually, as the caretList message asks
myIndexControl <- trainControl(method = "cv",
                               allowParallel = TRUE,
                               seeds = seeds,
                               index = index$train,
                               indexOut = index$test)

alg_list <- c("glmnet", "gbm", "lm")
multi_mod <- caretList(y ~ .,
                       data = dfml,
                       trControl = myIndexControl,
                       methodList = alg_list,
                       family = "gaussian",
                       metric = "RMSE",
                       seeds = seeds)

Any suggestion or workaround would be greatly appreciated. A single model can be tuned this way:

glmnet.fit <- train(y ~ .,
                    data = dfml,
                    method = "glmnet",
                    verbose = FALSE,
                    trControl = myIndexControl,
                    seeds = seeds,
                    tuneLength.num = 2,
                    linout = TRUE)
Mac