I have a dataset of daily temperatures covering 427 days. I train an ARIMA model on the first 360 days, predict the remaining 67 days, and compare the predictions with the actual values. When I predict over the test period, I just get a straight line. Am I doing something wrong?

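import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Fit ARIMA(1,1,2) on the 360 training days
model = ARIMA(train['max'], order=(1, 1, 2))
results = model.fit()
print(results.summary())

# Out-of-sample forecast over the 67 test days.
# The newer ARIMA class already predicts on the original scale, so typ='levels' is not needed.
start = len(train)
end = len(train) + len(test) - 1
predictions = pd.DataFrame()
predictions['pred'] = results.predict(start=start, end=end).rename('ARIMA(1,1,2) Predictions')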


1 Answer

Your ARIMA model uses the last two observations to make a prediction, which means:

  • The prediction for t(361) is based on the true values of t(360) and t(359).

  • The prediction for t(362) is based on the already predicted t(361) and the true t(360).

  • The prediction for t(363) is based on two predicted values, t(362) and t(361).

  • The prediction for t(400) is based on predictions that are based on predictions that are based on predictions etc.

Each prediction is based on previous predictions, so forecasting errors carry over into every new prediction. Imagine your prediction deviates by only 1% at each time step: the forecasting error grows larger the more time steps you try to predict. At the same time, no new observations enter the model, so each step carries less information than the one before and the forecast settles to a nearly constant value. That is the straight line you see.
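The recursion in the bullets above can be sketched without any library. This is only an illustration, not what statsmodels does internally: it assumes a plain AR(1) model on the differenced series, ignores the MA terms, and uses a made-up coefficient `phi` and made-up starting temperatures. Each step feeds the previous prediction back in, so the predicted day-to-day change shrinks geometrically and the forecast levels off.

```python
import numpy as np

# Hypothetical values for illustration only (not taken from the question's data).
phi = 0.6                 # assumed AR(1) coefficient on the differenced series
last_two = [24.0, 25.5]   # assumed last two observed temperatures, t(359) and t(360)

def recursive_forecast(last_two, phi, steps):
    """Forecast `steps` values, feeding each prediction back in as input."""
    history = list(last_two)
    forecasts = []
    for _ in range(steps):
        last_change = history[-1] - history[-2]
        next_value = history[-1] + phi * last_change  # AR(1) on the differences
        forecasts.append(next_value)
        history.append(next_value)                    # the prediction becomes input
    return np.array(forecasts)

forecast = recursive_forecast(last_two, phi, steps=67)
changes = np.diff(forecast)   # predicted day-to-day changes

print(forecast[:3])           # still moving in the first few steps
print(abs(changes[-1]))       # practically zero by the end of the horizon
```

With any `phi` between -1 and 1 the changes decay toward zero, so plotting `forecast` gives a short curve followed by a flat line.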

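If you use an ARIMA(p, d, q) model, the MA terms only influence the first q forecast steps. Beyond that, the forecast is driven by the AR terms alone and converges to a constant level (or to a straight trend line, depending on d and whether a trend term is included). Predicting 67 steps into the future is a very far horizon, and ARIMA is most likely not able to produce anything informative that far out. Instead, try to predict only the next single step or the next few steps.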

Mario
Arne Decker
  • Thank you so much for explaining that. Is there a way I can predict two time steps, add those predicted values to my original dataframe, then train the model and predict again? Basically, running the model inside a loop and updating the main dataset with the forecasted values each time. If yes, can you please provide a method to do it? If no, does that mean I get the same line with constant values? – dachu darshan Mar 02 '22 at 18:26
  • You can try results.predict(start=start, end=end, typ='levels', dynamic=True). If that does not work for you, then you have to create a loop in which you do exactly what you described: train the model, make predictions, add them to the data and repeat. – Arne Decker Mar 03 '22 at 08:14