2

As it is large I can't dput it here. But suppose the realmatrix is a "mts" with non-trivial values

realmatrix <- matrix(NA, ncol = 100, nrow = 138)

In fact it stores 100 time series with length (rows) = 138 (from Jan 2005 to June 2016).

I want to store the Arima forecasts (12 months ahead: that is, from July 2016 to June 2017) in another matrix farimamatrix (which should have 12 rows and 100 columns), via the following loop:

farimamatrix <- matrix(NA, nrow = 12, ncol = 100)

m <- k <- list()

for (i in 1:100) {
  try(m[[i]] <- Arima(realmatrix[,i], order = c(0,1,0), seasonal = c(1,0,1)))
  k[[i]] <- forecast.Arima(m[[i]], h=12)
  farimamatrix[,i] <- fitted(k[[i]])
  }

But I am getting the following message:

Error in farimamatrix[, i] <- fitted(k[[i]]) :

incorrect number of subscripts on matrix

What's wrong? Thanks in advance.


Edited (24/10): updated / corrected under Zheyuan's answer and previous problem gone

Original data:

tsdata <- 
structure(c(28220L, 27699L, 28445L, 29207L, 28482L, 28326L, 28322L, 
28611L, 29187L, 29145L, 29288L, 29352L, 28881L, 29383L, 29898L, 
29888L, 28925L, 29069L, 29114L, 29886L, 29917L, 30144L, 30531L, 
30494L, 30700L, 30325L, 31313L, 32031L, 31383L, 30767L, 30500L, 
31181L, 31736L, 32136L, 32654L, 32305L, 31856L, 31731L, 32119L, 
31953L, 32300L, 31743L, 32150L, 33014L, 32964L, 33674L, 33410L, 
31559L, 30667L, 30495L, 31978L, 32043L, 30945L, 30715L, 31325L, 
32262L, 32717L, 33420L, 33617L, 34123L, 33362L, 33731L, 35118L, 
35027L, 34298L, 34171L, 33851L, 34715L, 35184L, 35190L, 35079L, 
35958L, 35875L, 35446L, 36352L, 36050L, 35567L, 35161L, 35419L, 
36337L, 36967L, 36745L, 36370L, 36744L, 36303L, 36899L, 38621L, 
37994L, 36809L, 36527L, 35916L, 37178L, 37661L, 37794L, 38642L, 
37763L, 38367L, 38006L, 38442L, 38654L, 38345L, 37628L, 37698L, 
38613L, 38525L, 39389L, 39920L, 39556L, 40280L, 41653L, 40269L, 
39592L, 39100L, 37726L, 37867L, 38551L, 38895L, 40100L, 40950L, 
39838L, 40643L, 40611L, 39611L, 39445L, 38059L, 37131L, 36697L, 
37746L, 37733L, 39188L, 39127L, 38554L, 38219L, 38497L, 39165L, 
40077L, 38370L, 37174L), .Dim = c(138L, 1L), .Dimnames = list(
    NULL, "Data"), .Tsp = c(2005, 2016.41666666667, 12), class = "ts")

Code

library("forecast")

z <- stl(tsdata[, "Data"], s.window="periodic")

t <- z$time.series[,"trend"]
s <- z$time.series[,"seasonal"]
e <- z$time.series[,"remainder"]

# error matrix
ematrix <- matrix(rnorm(138 * 100, sd = 100), nrow = 138)

# generating a ts class error matrix
ematrixts <- ts(ematrix, start=c(2005,1), freq=12)

# combining the trend + season + error matrix into a real matrix
realmatrix <- t + s + ematrixts

# creating a (forecast) arima matrix
farimamatrix <- matrix(NA, ncol = 100, nrow = 12)

m <- k <- vector("list", length = 100)

for (i in 1:100) {
  try(m[[i]] <- Arima(realmatrix[,i], order = c(0,1,0), seasonal = c(1,0,1)))
  print(i)
  k[[i]] <- forecast.Arima(m[[i]], h = 12)
  farimamatrix[,i] <- k[[i]]$mean
  }

# ts.plot(farimamatrix[,1:100],col = c(rep("gray",100),rep("red",1)))

The loop seems to work, but breaks down after a few iterations due to failure of Arima:

Error in stats::arima(x = x, order = order, seasonal = seasonal, include.mean = include.mean, : " non-stationary seasonal AR part from CSS

Community
  • 1
  • 1
Mushrambo
  • 133
  • 3
  • 8
  • (1) Can you paste the output from `dput(head(realmatrix, 20))` (or give us a fake matrix to work with that still produces the error message) so we can reproduce your problem? See [How to make a great R reproducible example](http://stackoverflow.com/q/5963269/6455166). (2) `fitted()` is going to give you a vector of length `nrow(realmatrix)`, but `farimamatrix` has 12 rows. What is the desired contents of `farimamatrix`? Is it the forecasts? – Weihuang Wong Oct 22 '16 at 19:18
  • @WeihuangWong sure. (1) I've included the dput in the edited post. (2) Yes, the purpose of the farimatrix is to store the forecasts (12 rows/months) of each fitted arima. – Mushrambo Oct 22 '16 at 21:39
  • The code is incomplete without all library statements. – G. Grothendieck Oct 22 '16 at 21:59
  • `realmatrix` seems to be a vector, based on the `dput` -- is that right? That confuses me because both the object name and `realmatrix[,i]` in your script suggests to me that it is supposed to be a matrix. – Weihuang Wong Oct 22 '16 at 22:19
  • 1
    it is matrix indeed. It stores 100 time series with length (rows) = 138 (from Jan 2005 to June 2016). I want to store the arima forecasts (12 months ahead: that is, from July 2016 to June 2017) in another matrix (which should have 12 rows and 100 columns, naturally) – Mushrambo Oct 22 '16 at 22:49

1 Answers1

1

Yep, the previous problem is gone, and now you have a new problem, regarding the failure of Arima. Strictly speaking you should raise a new question on this. But I will answer it here anyway.

The error message is quite illustrative. When you fit a model ARIMA(0,1,0)(1,0,1), sometimes the seasonal part is non-stationary, so a further seasonal differencing is needed.

By looking at ts.plot(realmatrix),I see that all 100 columns of realmatrix are pretty similar. I will thus take out the first column for some analysis.

x <- realmatrix[,1]

Obviously the non-seasonal differencing is a must, but do we need a seasonal differencing as well? Have a check with ACF

acf(diff(x))

enter image description here

We actually spotted strong evidence that for the seasonal pattern. So yes, a seasonal differencing is needed.

Now let's check the ACF after both differencing:

acf(diff(diff(x, lag = 12)))  ## first do seasonal diff, then non-seasonal diff

enter image description here

There appears to be a negative spike between season, suggesting a seasonal MA process. So ARIMA(0,1,0)(0,1,1)[12] would be a good bet.

fit <- arima(x, order = c(0,1,0), seasonal = c(0,1,1))

Have a check at the residuals:

acf(fit$residuals)

enter image description here

I would actually be pretty happy about this result, as there is no lag 1 or even lag 2 autocorrelation at all, and there is also no seasonal autocorrelation. You can actually try further adding a seasonal and / or non-seasonal AR(1), but there will be no improvement. So this is our final model to go.

So use the following loop:

farimamatrix <- matrix(NA, ncol = 100, nrow = 12)

m <- k <- vector("list", length = 100)

for (i in 1:100) {
  m[[i]] <- Arima(realmatrix[,i], order = c(0,1,0), seasonal = c(0,1,1))
  print(i)
  k[[i]] <- forecast.Arima(m[[i]], h = 12)
  farimamatrix[,i] <- k[[i]]$mean
  }

Now all 100 model fitting are successful.

---------

A retrospect reflection

Perhaps I should explain why ARIMA(0,1,0)(1,0,1)[12] models works for my simulated data in the initial answer. Because note how I simulate my data:

seasonal <- rep_len(sin((1:12) * pi / 6), 138)

Yes, the underlying seasonal pattern is a true replication and of course stationary.

Zheyuan Li
  • 71,365
  • 17
  • 180
  • 248