I have the R script below, which runs to completion but takes more than 24 hours on a Windows 10 machine with 10 GB of RAM and a Core M7 processor. Here is what I want the script to do:
A. I generate a time series of 50 observations.
B. I slice that series into chunks of sizes 2, 3, ..., 48, 49, which gives me 48 different series formed from the series in step A.
C. I divide each of the 48 series into train and test sets so that I can use the rmse function in the Metrics package to get the Root Mean Squared Error (RMSE) for each of the 48 series formed in step B.
D. I tabulate the RMSE for each series according to its chunk size.
E. I obtain the best ARIMA model for each of the 48 series.
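To make steps B and C concrete, here is a minimal sketch for a single chunk size using base R only. It is an illustration under assumptions, not the full script: the stand-in series comes from arima.sim, only 20 blocks are drawn to keep it small, a naive training-mean forecast stands in for the ARIMA forecast, and rmse_sketch is a hypothetical helper that uses the usual RMSE formula.

```r
# Sketch of steps B and C for one chunk size (base R only).
set.seed(1)
x <- as.numeric(arima.sim(model = list(ar = 0.5), n = 50))  # stand-in AR(1) series

l <- 5                                   # one chunk size out of 2..49
m <- ceiling(length(x) / l)              # number of blocks
blocks <- split(x, rep(seq_len(m), each = l, length.out = length(x)))

# Resample blocks with replacement and join them into one series
# (20 blocks here, purely to keep the example small)
boot <- unlist(sample(blocks, size = 20, replace = TRUE), use.names = FALSE)

# 60/40 train/test split
n_train <- round(length(boot) * 0.6)
train <- head(boot, n_train)
test  <- tail(boot, length(boot) - n_train)

# Hypothetical helper: sqrt(mean((actual - predicted)^2))
rmse_sketch <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Naive forecast (the training mean) in place of an ARIMA forecast
rmse_sketch(test, rep(mean(train), length(test)))
```

With l = 5 the 50 observations split into 10 blocks of 5, so the 20 resampled blocks give a series of 100 values, 60 for training and 40 for testing.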
My R Script
# simulate arima(1,0,0)
library(forecast)
library(Metrics)
n <- 50
phi <- 0.5
set.seed(1)
wn <- rnorm(n, mean = 0, sd = 1)
ar1 <- sqrt((wn[1])^2 / (1 - phi^2))
for (i in 2:n) {
  ar1[i] <- ar1[i - 1] * phi + wn[i]
}
ts <- ar1
t <- length(ts) # the length of the time series
li <- seq(n - 2) + 1 # vector of block sizes l with 1 < l < n (i.e. strictly between 1 and n)
# matrix to store the mean RMSE for each block size
RMSEblk <- matrix(nrow = 1, ncol = length(li))
colnames(RMSEblk) <- li
for (b in 1:length(li)) {
  l <- li[b] # block size
  m <- ceiling(t / l) # number of blocks
  blk <- split(ts, rep(1:m, each = l, length.out = t)) # divides the series into blocks
  # initialize vector to receive result from for loop
  singleblock <- vector()
  for (i in 1:1000) {
    res <- sample(blk, replace = TRUE, 10000) # resamples the blocks
    res.unlist <- unlist(res, use.names = FALSE) # unlist the bootstrap series
    # Split the series into train and test set
    train <- head(res.unlist, round(length(res.unlist) * 0.6))
    h <- length(res.unlist) - length(train)
    test <- tail(res.unlist, h)
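    # Fit on the train set, then forecast h steps ahead so the forecasts line up with the test set
    # (forecast(test, model = model, h = h) would instead forecast the h periods after the test set)
    model <- auto.arima(train)
    future <- forecast(model, h = h)
    nfuture <- as.numeric(future$mean) # makes the `future` object a vector
    RMSE <- rmse(test, nfuture) # use the `rmse` function from `Metrics` package
    singleblock[i] <- RMSE # Assign RMSE value to final result vector element i
  }
  RMSEblk[b] <- mean(singleblock) # store into matrix
}
RMSEblk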
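For a sense of scale, the loop bounds in the script imply the amount of work tallied below. This is only a rough sketch: the variable names are my own for illustration, and the series length is an expected value that relies on sample drawing each block with equal probability.

```r
n     <- 50                # length of the original series
sizes <- 2:(n - 1)         # block sizes, the same values as li
reps  <- 1000              # inner-loop replicates per block size
draws <- 10000             # blocks drawn per replicate

# Total number of auto.arima() fits across both loops
n_fits <- length(sizes) * reps

# Expected length of one bootstrap series for each block size:
# the blocks partition the series, so the mean block length is n / ceiling(n / l)
mean_block_len <- n / ceiling(n / sizes)
expected_len   <- draws * mean_block_len

n_fits                     # 48000
range(expected_len)        # 20000 250000
```

So the script fits 48,000 models, each on a training set of roughly 12,000 to 150,000 values (60% of the expected series length).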
The R script does run, but it takes more than 24 hours to complete. The counts used in the loops (10000 and 1000) are the minimum needed for the task, so I cannot reduce them.
What can I do to make the script complete in less time?