
This R code forecasts sp500 with an ARIMA model:

rm(list = ls())
load("StockData.Rdata")
library(seasonal)
library(forecast)
library(tseries)
library(astsa)
# The first column of StockData holds the sp500 series
sp500 = StockData[, 1]
model = auto.arima(sp500,
                   max.p = 52, max.q = 52, max.d = 52,
                   seasonal = TRUE,
                   lambda = "auto", biasadj = TRUE)
# Stored as fc rather than predict, to avoid masking stats::predict
fc = forecast(model, biasadj = TRUE, h = 2)

Here, I want to use auto.arima with a Box-Cox transformation, so I set lambda = "auto" and biasadj = TRUE. However, the point forecasts are NaN:

> forecast(model,biasadj=TRUE, h=2)
    Point Forecast      Lo 80     Hi 80      Lo 95     Hi 95
209            NaN -0.1278079 0.1206801 -0.1962931 0.1890652
210            NaN -0.1278079 0.1206801 -0.1962931 0.1890652

I have checked some GitHub issues, and some people say the point forecast is NaN because the residuals contain NA values. However, the residuals of my model don't contain any NaN, so that should not be the cause here. How can I solve this?

Also, why is the predicted value the same for t=209 and t=210?

Ray Jio
  • It's difficult to debug without having access to your data (see [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for tips on how to do that). How does your `model` fit the data? On a side note: Some of your `auto.arima` parameters are not very sensible; for example, setting the maximum degree of differencing to 52 is just nonsense. Obviously it depends on the area of work and the underlying data generation process but I'd be suspicious of any d>3 ARIMA fit. The `max.p` and `max.q` args also seem way too high. – Maurits Evers Apr 22 '21 at 06:05
  • @MauritsEvers I uploaded the data: https://github.com/118020071/bankcruptcy_data/blob/main/StockData.RData – Ray Jio Apr 22 '21 at 06:28

1 Answer


I have a feeling there is a misunderstanding about what biasadj does, but I think you need to take a step back and inspect the original data.

Let's plot sp500:

library(ggplot2)
ggplot(data.frame(x = seq_along(sp500), y = sp500), aes(x, y)) +
    geom_line()

[Figure: line plot of sp500 against observation index]

A cursory inspection suggests no seasonality and no drift/linear trend; there may be some indication of heteroskedasticity which may warrant a Box-Cox transformation.
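For reference, here is a minimal sketch of what a Box-Cox transformation does, using `BoxCox()` and `InvBoxCox()` from the forecast package. The series `y` and the fixed `lambda = 0.5` are made up for illustration; they are not taken from the sp500 data.

```r
library(forecast)

set.seed(1)
# Hypothetical positive series whose spread grows with its level
level <- seq(1, 10, length.out = 200)
y <- level * exp(rnorm(200, sd = 0.2))

lambda <- 0.5                    # illustrative value, not an estimate
w <- BoxCox(y, lambda)           # (y^lambda - 1) / lambda for lambda != 0
y_back <- InvBoxCox(w, lambda)   # inverse transform, no bias adjustment

# The round trip recovers the original values
all.equal(as.numeric(y_back), as.numeric(y))
```

The transformation compresses large values more than small ones when lambda is below 1, which is why it is often tried when the variance appears to change with the level.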

Keeping model parsimony in mind, I would start with a default call to auto.arima, allowing for an automatic Box-Cox transformation:

library(forecast)
model <- auto.arima(sp500, lambda = "auto")
#Series: sp500 
#ARIMA(0,0,1) with non-zero mean 
#Box Cox transformation: lambda= 0.7637886 
#
#Coefficients:
#          ma1     mean
#      -0.1142  -1.3103
#s.e.   0.0582   0.0041
#
#sigma^2 estimated as 0.00548:  log likelihood=308.93
#AIC=-611.86   AICc=-611.77   BIC=-601.18

auto.arima determines the optimal model to be ARIMA(0,0,1) on Box-Cox-transformed data with lambda = 0.76, which corresponds to a simple first-order moving average model MA(1). This seems to be consistent with the earlier visual assessment.
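One property of an MA(1) model worth keeping in mind: it only remembers the last shock, so point forecasts from h = 2 onwards equal the estimated mean. A small sketch with base R on simulated data (the coefficient -0.1 and the length 208 are illustrative choices, and this is not the sp500 series):

```r
set.seed(123)
# Simulated MA(1) series; ma = -0.1 is an illustrative value
y <- arima.sim(model = list(ma = -0.1), n = 208)

fit <- arima(y, order = c(0, 0, 1))    # MA(1) with a mean ("intercept")
fc <- predict(fit, n.ahead = 3)$pred   # point forecasts for h = 1, 2, 3

mu <- unname(coef(fit)["intercept"])
# Only the h = 1 forecast uses the last residual; h = 2 and h = 3 equal the mean
all.equal(as.numeric(fc[2:3]), c(mu, mu))
```

With a weak MA term, even the h = 1 forecast will sit close to the mean, so successive forecasts can look almost identical.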

When forecasting, forecast automatically reports estimates on the original scale; see e.g.

autoplot(forecast(model, h = 10))

[Figure: sp500 with a 10-step forecast and prediction intervals]
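To see what `biasadj` changes in `forecast()`, here is a small sketch on a simulated, strictly positive series. The data are hypothetical, and `lambda = 0` (a log transformation) is fixed for illustration rather than estimated. Both calls return forecasts on the original scale; only the point forecast differs.

```r
library(forecast)

set.seed(42)
# Hypothetical log-normal series, not the asker's data
y <- ts(exp(rnorm(200, mean = 1, sd = 0.5)))

fit <- auto.arima(y, lambda = 0)

fc_median <- forecast(fit, h = 2, biasadj = FALSE)  # back-transformed median
fc_mean   <- forecast(fit, h = 2, biasadj = TRUE)   # bias-adjusted mean

# Side-by-side point forecasts on the original scale
cbind(median = fc_median$mean, mean = fc_mean$mean)
```

For a skewed distribution like this one the adjusted mean lies above the median; when the forecast distribution is close to symmetric, the two columns should be nearly the same.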

Maurits Evers
  • After several trials, I found that biasadj only adjusts the training data according to the lambda value, and that adding lambda=model$lambda to the forecast() arguments gives a valid mean instead of NaNs. – Ray Jio May 11 '21 at 08:30
  • @RayJio No, that is not how `biasadj` works. `forecast` will give values on the original (i.e. pre-Box-Cox transformed) scale either way. `biasadj` introduces an adjustment that ensures that back-transformed values correspond to the mean of the forecast distribution rather than the median. You can find some details [here](https://math.stackexchange.com/questions/2974566/bias-adjustment-for-the-box-cox-back-transformation), [here](https://harlecin.netlify.app/post/box-cox-and-other-transformations/) and [here](https://otexts.com/fpp2/transformations.html). – Maurits Evers May 15 '21 at 13:44
  • [continued] If the forecast distribution is symmetric (e.g. more or less normal), there is nothing to adjust, and forecasts with and without bias adjustment will be (more or less) the same. More important in your case, there is really not much to forecast here: the data don't show any seasonality, and the BC-transformed data suggest a stationary process with a weak MA(1) term. – Maurits Evers May 15 '21 at 13:52
  • Thanks for your patience. Now I can successfully forecast. – Ray Jio May 16 '21 at 03:39
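For reference, the adjustment discussed in the comments can be written out as in the Forecasting: Principles and Practice chapter linked there. Here $w_t$ is the forecast on the transformed scale, $\sigma_h^2$ is the h-step forecast variance on that scale, and $\lambda$ is the Box-Cox parameter. The plain back-transformation gives the median of the forecast distribution; the bias-adjusted version approximates its mean:

```latex
\[
\hat{y}_t^{\text{median}} =
\begin{cases}
\exp(w_t) & \lambda = 0,\\[1ex]
(\lambda w_t + 1)^{1/\lambda} & \text{otherwise,}
\end{cases}
\qquad
\hat{y}_t^{\text{mean}} =
\begin{cases}
\exp(w_t)\left[1 + \dfrac{\sigma_h^2}{2}\right] & \lambda = 0,\\[2ex]
(\lambda w_t + 1)^{1/\lambda}
\left[1 + \dfrac{\sigma_h^2\,(1-\lambda)}{2(\lambda w_t + 1)^{2}}\right] & \text{otherwise.}
\end{cases}
\]
```

One thing that may be worth checking, as an assumption not verified against the data: with the fitted values in the answer, $\lambda w_t + 1 \approx 0.7638 \times (-1.3103) + 1 \approx -0.0008$, which is slightly negative. Fractional powers of a negative number are not defined in real arithmetic, so this could be one way a bias-adjusted point forecast ends up as NaN.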