This has got nothing to do with auto.arima(). You are using ts objects, which do not store dates explicitly; instead they store the starting time, ending time and the frequency (the number of observations per time period). The forecast package is designed to handle ts objects.
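As a small illustration of what a ts object keeps, here is a sketch in base R. The values and the monthly frequency are made up for the example; they are not taken from your data.

```r
# A ts object keeps only start, end and frequency; there is no date column.
# The values below are invented purely for illustration.
y <- ts(c(101, 103, 102, 105, 107, 106), start = c(2020, 1), frequency = 12)

# tsp() returns the three stored attributes: start, end, frequency.
y_attributes <- tsp(y)
print(y_attributes)

# time() rebuilds the time index from those attributes on demand.
y_times <- time(y)
print(y_times)
```

The time index is reconstructed from the three attributes whenever it is needed, which is why an irregular calendar such as trading days cannot be represented directly.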
Since you are using tidyverse already, you will probably find it easier to use the newer tsibble class of objects, which stores dates explicitly. Here is the same analysis you posted but done using tsibble objects.
library(tidyverse)
library(tsibble)
library(feasts)
#> Loading required package: fabletools
library(fable)
# Index by trading day rather than Date to avoid gaps on non-trading days
NDX_prices <- read_csv("~/Downloads/NDX.csv") %>%
  mutate(trading_day = row_number()) %>%
  as_tsibble(index = trading_day)
#> Parsed with column specification:
#> cols(
#>   Date = col_date(format = ""),
#>   Close = col_double()
#> )
NDX_prices %>%
  autoplot(Close) +
  ggtitle("NASDAQ 100 ARIMA") + ylab("Closing Prices")

fit_ARIMA <- NDX_prices %>%
  model(arima = ARIMA(Close))
fit_ARIMA %>% report()
#> Series: Close
#> Model: ARIMA(1,1,0)
#>
#> Coefficients:
#>           ar1
#>       -0.3767
#> s.e.   0.0583
#>
#> sigma^2 estimated as 25768: log likelihood=-1630.42
#> AIC=3264.83 AICc=3264.88 BIC=3271.88
fit_ARIMA %>% gg_tsresiduals()

fcast <- fit_ARIMA %>% forecast(h = 1)
fcast
#> # A fable: 1 x 4 [1]
#> # Key:     .model [1]
#>   .model trading_day           Close  .mean
#>   <chr>        <dbl>          <dist>  <dbl>
#> 1 arima          253 N(10318, 25768) 10318.
fcast %>% autoplot(NDX_prices)

Created on 2020-07-06 by the reprex package (v0.3.0)
Notes:
- Your data is a CSV file, not an Excel file, so you can't use read_excel. Instead use read_csv.
- Because trading does not happen every day, you need to index the series by trading_day (number of trading days since the start of the series) rather than Date. Otherwise the series will contain a lot of missing values.
- Similarly, the forecasts are indexed by trading day, not date. But these can be translated back to dates if you know which future days will be trading days.
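
For the last note, here is a rough sketch in base R of mapping forecast steps back to calendar dates. The helper next_trading_dates is hypothetical (it is not part of fable or tsibble), the last observed date is assumed for illustration, and the sketch treats every Monday to Friday as a trading day, so exchange holidays are ignored and would need to be removed separately.

```r
# Hypothetical helper: return the next h weekdays after last_date.
# Assumption: every Monday-Friday is a trading day (holidays are ignored).
next_trading_dates <- function(last_date, h) {
  # Look far enough ahead that at least h weekdays are guaranteed.
  candidates <- last_date + seq_len(h * 2 + 7)
  # "%u" gives the ISO weekday number: 1 = Monday, ..., 7 = Sunday.
  is_weekday <- as.integer(format(candidates, "%u")) <= 5
  candidates[is_weekday][seq_len(h)]
}

# Assumed last date in the series (a Friday), used only for illustration.
last_date <- as.Date("2020-07-03")
future_dates <- next_trading_dates(last_date, h = 3)
print(future_dates)
```

The k-th element of future_dates would then correspond to the forecast k trading days ahead, so it could be attached to the forecast rows as an extra column. For real use, an exchange holiday calendar should replace the weekday assumption.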