
Below I am forecasting the next 30 days for each series. When the input data is around 100k, the for loop is extremely slow (it takes about 2 hours). The code using the for loop is below.

ns <- ncol(TS)  # number of columns, to set the loop length

output <- matrix(NA, nrow = 30, ncol = ns)

for (i in 2:ns) {
  output[, i] <- forecast(auto.arima(TS[, i], allowmean = TRUE, D = 1), h = 30)$mean
  i <- i + 1
}

I have tried using lapply as below, but the run time remains the same.

lapply(TS, function(x) forecast(auto.arima(x, allowmean = TRUE, D = 1), h = 30))

Is there an alternative function or method I can use to improve the performance?

dagan
  • Please provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – Emmanuel-Lin Jul 03 '18 at 09:37
  • The runtime for the apply family and an explicit for loop is more or less the same. There is no gain in efficiency from using the apply family; they are just implicit for loops. – Onyambu Jul 03 '18 at 09:38
  • That `i = i+1` does absolutely nothing. – LAP Jul 03 '18 at 09:38
  • In `for` loops you don't need to put `i = i+1`. – Emmanuel-Lin Jul 03 '18 at 09:38
  • Try some parallel lapply. – s.brunel Jul 03 '18 at 09:43
  • The internal `apply` functions are de facto deprecated now. Use the package `purrr` for single-threaded and `furrr` for parallel loops. Edit: and as @Onyambu said, there are no efficiency gains over `for` unless you use multiple cores. – Juergen Jul 03 '18 at 09:46
  • The [discussion](https://stackoverflow.com/questions/28983292/is-the-apply-family-really-not-vectorized) here may be useful – VicaYang Jul 03 '18 at 12:30
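
The parallel lapply suggested in the comments can be sketched with the base `parallel` package. This is a minimal sketch, not a drop-in answer: it assumes the `forecast` package is installed, and the small synthetic `TS` below stands in for the real input so the example is self-contained.

```r
library(parallel)
library(forecast)

# Synthetic stand-in for the real TS: 4 monthly series as columns
set.seed(1)
TS <- ts(matrix(cumsum(rnorm(200 * 4)), ncol = 4), frequency = 12)

cl <- makeCluster(max(1, detectCores() - 1))
clusterEvalQ(cl, library(forecast))  # load 'forecast' on every worker

# Split TS column-wise and fit/forecast each series on a separate worker;
# frequency = 12 is re-attached because as.data.frame() drops the ts attributes
fc <- parLapply(cl, as.list(as.data.frame(TS)), function(x)
  forecast(auto.arima(ts(x, frequency = 12), allowmean = TRUE, D = 1),
           h = 30)$mean)

stopCluster(cl)

output <- do.call(cbind, fc)  # 30 x ncol(TS) matrix of forecasts
```

Since each `auto.arima` fit is independent, the wall-clock time should drop roughly in proportion to the number of cores; the model search itself still dominates per series. `furrr::future_map`, mentioned in the comments, is an equivalent higher-level alternative to `parLapply`.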

0 Answers