I am a beginner in R and appreciate any help or tip to develop a function/loop to automate forecasting process below: Here's a dummy data set
> class(stack_help)
[1] "data.frame"
> stack_help
OO GG CC DD
1 198.12 60.56 265.5 271.24
2 145.68 52.28 328.9 427.68
3 106.48 47.08 380.24 695.60
4 83.16 43.52 443.94 934.30
5 89.72 46.68 484.6 1084.26
6 86.48 35.46 415.56 924.68
7 93.68 24.40 376.42 798.14
8 101.70 22.68 260.42 427.72
9 115.88 22.00 228.26 245.72
10 137.24 21.60 212.7 140.64
11 129.82 18.78 230.02 46.04
12 145.00 17.62 220.74 47.16
13 135.38 18.84 245.52 143.28
14 146.38 20.68 322.18 490.20
15 154.08 19.48 374.16 621.48
16 149.34 22.68 484.28 999.50
17 152.74 28.90 533.26 1223.58
18 148.62 27.44 456.76 974.44
19 158.54 23.90 417.52 820.54
20 169.96 27.08 306.16 498.02
21 152.50 33.74 283.1 309.22
22 149.68 38.44 224.54 123.82
23 149.48 38.94 215.28 30.26
24 153.38 36.24 193.18 75.46
25 155.58 37.88 243.34 228.92
26 165.84 37.00 318.08 528.58
27 171.34 38.96 393.6 707.04
28 183.60 48.20 531.62 1169.40
29 192.58 44.46 507.96 1037.22
30 207.92 43.52 435 956.96
31 228.88 47.44 399.58 788.78
32 246.14 45.74 262.84 397.66
33 228.92 45.98 240.8 255.32
34 227.52 45.22 211.44 96.02
35 232.92 43.02 203.08 62.18
36 220.16 43.88 188.56 63.74
37 221.76 46.78 210.58 131.28
38 218.94 45.10 272.36 438.64
39 221.00 47.48 351.58 689.90
40 215.82 44.68 402.82 854.80
41 222.32 43.74 435.06 1013.92
42 239.40 52.26 474.24 1128.04
43 249.86 47.62 324.92 689.40
44 240.92 49.60 289.82 538.98
45 221.04 48.40 218.74 256.80
46 191.18 47.34 192.36 136.84
47 206.28 48.66 188.22 60.60
48 226.68 48.12 174.54 58.36
49 226.76 51.66 204.26 190.58
50 223.94 53.40 272.22 454.56
51 219.42 54.50 339.26 647.94
52 219.36 54.68 #VALUE! 1040.08
53 225.94 53.06 462.82 1066.12
54 233.04 52.64 425.32 916.22
55 218.48 64.22 438.06 961.36
56 205.76 56.44 292.24 534.28
57 206.06 53.42 225.32 272.24
58 206.22 52.50 190.2 117.16
59 215.44 52.14 182.12 32.56
60 221.92 51.10 175.82 47.50
Appreciate any suggestions for improvement of the process below and hot to use the apply function or a loop function to automate it.
The column OO is the variable I wnat to make create a forecast model with .
The other columns are predictors that I want to test if the forecast works better with them or with only the past data of OO.
I took 36 observations to fit an Arima model with function auto.arima from the "forecast" package .
The function provides some model parameters p,d,q, , let's say 0,1,0
Now I want to test the model in an automated way and perform the below :
a. forecast the next period ahead , On the data table above would be equivalent to the row 37 .
b. take the results of the forecast and compare with the historical data , the row 37 , column OO.
c. call the accuracy function from the package "forecast" and compare with the data point row 37 . PLus , Store the error measures in a vector.
d. Update the 'xdata' argument adding the historical point 37 and also the 'xreg' argument with one more month for the predictor and call another forecast for the next period and redo this process until I complete a test of 24 forecasts.
Although I fitted the model with the package "forecast" I found easier to use the function 'sarima.for' from the package astsa.
Before the code , still more info :
- Train.OO would be a time series object of the first 36 observations of the data table above
- n.ahead = argument of the horizon of the forecast : 1 period in this case
- 0,1,0 would be the ARIMA model (p,d,q)
- Train.GG would be the predictor variable , teh first 36 observations of column GG
- newxreg is just a cut of one data point in TS object from the data table that would be the predictor of forecast.
Now the code
fc.1 <- sarima.for(
xdata = Train.00,
n.ahead = 1, 0, 1, 0 ,
xreg = Train.GG,
newxreg = window(ts(slack_help$GG, start = c(2009,1),
frequency = 12), start = c(2012,1) , end = c(2012,11)))
fc.1
fc.1.acc <- accuracy(fc.1$pred,
window(ts((slack_help$OO), start = c(2009,1),frequency =
12), start = c(2012,1), end = c(2012,1), frequency =12)
Now the second command :
fc.2 <- sarima.for(
xdata = window(ts((slack_help$OO), start = c(2009,1),frequency = 12),
start = c(2009,1), end = c(2012,1), frequency =12),
n.ahead = 1, 0, 1, 0 ,
xreg = window(ts((slack_help$GG), start = c(2009,1),frequency = 12),
start = c(2009,1), end = c(2012,1), frequency =12),
newxreg = window(ts((slack_help$GG), start = c(2009,1),frequency = 12),
start = c(2009,2), end = c(2012,2), frequency =12),
fc.2
fc.2.acc <- accuracy(fc.2$pred,
window(ts((slack_help$OO), start = c(2009,1),frequency =
12), start = c(2012,2), end = c(2012,2), frequency =12)
fc.2.acc
And I did this for the the following forecasts. Basically the same code , Just updated the dates of the window functions to cut the right time series to be considered on the forecast.
Total 24 calls.
I know this was inefficient "brute force". However, I am a bit lost on how to start to develop the function/loop. Appreciate any comment or tip on how to automate the steps mentioned. Thanks in advance!