2

I am trying to loop over triples (i, j, k), where i and k take values from 0 to 5 and j takes values from 0 to 3. The loop would run over values like:

(0, 0, 0)
(0, 0, 1)
(0, 0, 2)
(0, 0, 3)
(0, 0, 4)
(0, 0, 5)
(0, 1, 0)
(0, 1, 1)
(0, 1, 2)
.
.
.
(5, 3, 5)

Basically, I want to fit an arima(p, d, q) model for each combination inside the loop and extract the RMSE value from each fit.

The arima code I tried is:

fit <- arima(df.train$Positive, order=c(0, 0, 0),include.mean = FALSE)
S <- as.data.frame(summary(fit))
S$RMSE

"S$RMSE" gives the RMSE value. How can I loop over "order = c(i, j, k)" and collect this RMSE value automatically?

Finally, I want to cbind the orders and the RMSE values into a table like:

Order      RMSE
(0, 0, 0)  xxxx
(0, 0, 1)  xxxx
(0, 0, 2)  xxxx
Sukanya Acharya
  • What have you tried? Are you going to post a [reproducible example](https://stackoverflow.com/a/5963610/5619526) so that others may experiment with a data set? In the meantime, try using `apply` with `MARGIN = 1` to loop over the rows of `expand.grid(i = 0:5, j = 0:3, k = 0:5)` – bouncyball May 18 '18 at 17:21
  • I want to understand how to write a loop around the `fit` variable I created. I am new to concepts such as loops. I did not think the data was important here, but it looks like this: Year- (2015, 2014, ...) Positive- (19904, 19815, ...) – Sukanya Acharya May 18 '18 at 17:35

4 Answers

2

Without having access to your data, it's impossible to test if the following code solves your problem.

Try using the apply function to loop over the rows of a matrix defined for i, j, and k using expand.grid:

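library(forecast)  # provides the summary() method whose output includes RMSE

# One row per (i, j, k) combination
param_data <- expand.grid(i = 0:5, j = 0:3, k = 0:5)

# Fit one model per row and append its RMSE as a new column
param_data2 <- cbind(param_data,
  RMSE = apply(param_data, 1,
    FUN = function(x) {
      fit <- arima(df.train$Positive,
                   order = x,
                   include.mean = FALSE)
      S <- as.data.frame(summary(fit))
      S$RMSE
    })
)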
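
One caveat with this approach: if `arima` throws an error for any single order (for example "non-stationary AR part from CSS"), `apply` stops and no results are returned. The sketch below is one possible way to guard against that; it is not part of the original answer. It assumes simulated data as a stand-in for `df.train$Positive`, uses a hypothetical helper named `safe_rmse`, and computes the in-sample RMSE directly from the model residuals instead of going through `summary`, which should give the same quantity but is an assumption worth checking on your own data.

```r
# Simulated stand-in for df.train$Positive (assumption: a plain numeric vector)
set.seed(1)
y <- rnorm(84, mean = 5000, sd = 1500)

# Small grid to keep the run short; widen to 0:5, 0:3, 0:5 for the full search
param_data <- expand.grid(i = 0:1, j = 0:1, k = 0:1)

# Hypothetical helper: in-sample RMSE for one order, or NA if the fit fails
safe_rmse <- function(ord) {
  tryCatch({
    fit <- arima(y, order = as.numeric(ord), include.mean = FALSE)
    sqrt(mean(residuals(fit)^2, na.rm = TRUE))
  }, error = function(e) NA_real_)
}

# One RMSE per row; failed fits show up as NA instead of aborting the loop
param_data$RMSE <- apply(param_data, 1, safe_rmse)
param_data
```

Orders that fail to fit are then easy to spot with `param_data[is.na(param_data$RMSE), ]`.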
bouncyball
0

This does not answer your exact question, but you might be able to use the auto.arima function from the forecast package, which selects the best ARIMA model on its own.

You can set maximum values for p, d, and q there as well.
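
A minimal sketch of such a call, assuming the forecast package is installed and using simulated data in place of `df.train$Positive`. Note that `auto.arima` ranks models by an information criterion (AICc by default) rather than by RMSE, chooses `d` with unit root tests rather than by searching over it, and, when `stepwise = FALSE`, also limits the total order through `max.order`. It is therefore not a drop-in replacement for a full RMSE comparison.

```r
library(forecast)  # assumed to be installed

# Simulated stand-in for df.train$Positive
set.seed(1)
y <- rnorm(84, mean = 5000, sd = 1500)

best <- auto.arima(
  y,
  max.p = 5, max.d = 3, max.q = 5,  # upper bounds matching the grid in the question
  seasonal = FALSE,                 # non-seasonal models only
  stepwise = FALSE,                 # search over candidates instead of stepping greedily
  approximation = FALSE             # use exact likelihoods when comparing models
)

arimaorder(best)          # selected (p, d, q)
accuracy(best)[, "RMSE"]  # in-sample RMSE of the selected model
```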

An ugly but easy solution is a triple for loop:

order <- c()
RMSEs <- c()
# Ranges follow the question: i and k in 0:5, j in 0:3
for (i in 0:5) {
  for (j in 0:3) {
    for (k in 0:5) {
      order_temp <- sprintf('(%s, %s, %s)', i, j, k)
      order <- c(order, order_temp)
      fit <- arima(df.train$Positive, order=c(i, j, k),include.mean = FALSE)
      S <- as.data.frame(summary(fit))
      RMSEs <- c(RMSEs, S$RMSE)
    }
  }
}
result <- as.data.frame(order)
result$RMSE <- RMSEs
johnnyheineken
  • Hi @johnnyheineken, I want to compare the RMSE values generated for all the combinations of (p, d, q), so auto.arima would not give me that. – Sukanya Acharya May 18 '18 at 18:14
  • This gives: "Error in arima(df.train$Positive, order = c(i, j, k), include.mean = FALSE) : non-stationary AR part from CSS. In addition: Warning message: In arima(df.train$Positive, order = c(i, j, k), include.mean = FALSE) : possible convergence problem: optim gave code = 1". I am not sure why the models work when run separately but not in the loop. – Sukanya Acharya May 18 '18 at 18:40
0

Example data standing in for yours

set.seed(1)
df.train = data.frame("Month/Year" = paste0(month.abb,"/",rep(12:18,each=12)), Positive = rnorm(84,5000,1500))
head(df.train)
  Month.Year Positive
1     Jan/12 4060.319
2     Feb/12 5275.465
3     Mar/12 3746.557
4     Apr/12 7392.921
5     May/12 5494.262
6     Jun/12 3769.297

The orders (p, I, q) of the arima models

library(gtools)
param = permutations(n=6,r=3,v=0:5,repeats.allowed=TRUE)
param = cbind(param[param[,2] <= 3,],0)
colnames(param) <- c("p","I","q","RMSE")
param
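
If you would rather not depend on gtools, the same grid can be sketched in base R with `expand.grid`. This is an alternative construction, not the code used in this answer; `param_base` is a name introduced here for illustration. Listing `q` first makes it vary fastest, which reproduces the (0, 0, 0), (0, 0, 1), ... ordering from the question.

```r
# Base-R sketch: q is listed first so it varies fastest, then columns are reordered
grid <- expand.grid(q = 0:5, I = 0:3, p = 0:5)[, c("p", "I", "q")]

# Same layout as param: three order columns plus an RMSE column initialised to 0
param_base <- cbind(as.matrix(grid), RMSE = 0)

head(param_base)
nrow(param_base)  # 6 * 4 * 6 = 144 combinations
```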

A function that calculates the RMSE of each fitted arima model

library(forecast)
RMSE = function(param){
  # Fit one model per row of param and store its RMSE in column 4
  for(i in 1:nrow(param)){
    fit <- Arima(df.train$Positive, order = param[i, 1:3],
                 include.mean = FALSE, method = "ML")
    s <- data.frame(summary(fit))
    param[i, 4] <- s$RMSE
  }
  return(param)
}

The result

result = RMSE(param)
head(result)
     p I q     RMSE
[1,] 0 0 0 5308.368
[2,] 0 0 1 3536.816
[3,] 0 0 2 2820.933
[4,] 0 0 3 2555.799
[5,] 0 0 4 2438.050
[6,] 0 0 5 2151.455

Note: for this data, the best model according to the RMSE criterion is an ARIMA(4, 1, 5)

result[which(result[,4] == min(result[,4])),]
      p       I       q    RMSE 
   4.00    1.00    5.00 1226.28
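
Since the goal in the question is to compare all combinations rather than only pick the winner, it may help to sort the whole table by RMSE. A minimal sketch, using a small hypothetical matrix `result_demo` in place of `result` (its RMSE values are copied from the first rows printed above, purely for illustration):

```r
# Hypothetical stand-in for result, built from the first four rows shown above
result_demo <- cbind(p = 0, I = 0, q = 0:3,
                     RMSE = c(5308.368, 3536.816, 2820.933, 2555.799))

# All orders ranked from lowest to highest RMSE
ranked <- result_demo[order(result_demo[, "RMSE"]), ]
ranked

# which.min() is a shorter way to pull out the single best row
result_demo[which.min(result_demo[, "RMSE"]), ]
```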
Rafael Díaz
  • I received this error: "Error in param[i, 4] <- s$RMSE : number of items to replace is not a multiple of replacement length" – Sukanya Acharya May 18 '18 at 19:03
  • This line basically clears the environment. I ran it and tried the whole code again, and it gave the same error. – Sukanya Acharya May 18 '18 at 19:44
  • Run `remove(list = ls())`, then copy all the code again and run it. Load the package with `library(forecast)` and, inside the loop, be careful to use the `Arima` function from forecast, with a capital `A`. Then run everything again. – Rafael Díaz May 18 '18 at 20:30
  • For some data, I get this error: "Error in solve.default(res$hessian * n.used, A) : Lapack routine dgesv: system is exactly singular: U[1,1] = 0". Can anyone explain why this happens? – Sukanya Acharya May 22 '18 at 10:13
  • Please create a new question for that issue and, if possible, include the data you are using or a sample of it. – Rafael Díaz Jun 14 '18 at 23:51
0

The code I am using:

Demo <- read.csv("C:/UsersMP.csv", header = TRUE)
Dem <- data.frame(Demo)
smp_size <- floor(0.95 * nrow(Dem))
df.train <- Dem[1:smp_size, ]
df.test <- Dem[(smp_size+1):nrow(Dem), ]
fit <- arima(df.train$Positive, order=c(0, 0, 0),include.mean = FALSE)
S1 <- as.data.frame(summary(fit))
S1$RMSE
fit1 <- arima(df.train$Positive, order=c(0, 0, 1),include.mean = FALSE)
S2 <- as.data.frame(summary(fit1))  # summarise fit1, not fit
S2$RMSE
fit2 <- arima(df.train$Positive, order=c(0, 0, 2),include.mean = FALSE)
S3 <- as.data.frame(summary(fit2))  # summarise fit2, not fit
S3$RMSE

This works properly. I just want to avoid repeating these lines by hand and instead loop over (i, j, k).

Sukanya Acharya