I'm implementing an AR(7) model using the Darts library, and I want to use a rolling window technique to test my results on my dataset.
My dataset is not huge, only 1825 days (5 years, 365*5). The rolling window size is 365 and I am predicting just the next day, so forecast horizon = 1.
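For scale, here is a minimal sketch of how many one-step forecasts this setup implies. It is arithmetic only; the helper name `count_rolling_forecasts` is hypothetical and not part of Darts.

```python
def count_rolling_forecasts(n_obs: int, window: int, horizon: int = 1) -> int:
    """Count forecast origins when each step uses `window` training points,
    predicts `horizon` steps ahead, and the window advances one step at a time."""
    return max(n_obs - window - horizon + 1, 0)


# Numbers from the description above: 1825 days, window of 365, horizon of 1
n_forecasts = count_rolling_forecasts(n_obs=1825, window=365, horizon=1)
print(n_forecasts)  # 1460
```

With 1460 steps, 3 minutes is roughly 0.12 seconds per step, while 5 seconds is roughly 0.003 seconds per step.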
Using the historical_forecasts method of the ARIMA model, it takes a long time, about 3 minutes, to return the results.
Just out of curiosity, I tried to implement this backtesting technique myself: I created the lagged dataset and fit a simple LinearRegression() from sklearn, and at each iteration I moved the training window and predicted the next day. The total time was around 5 seconds, and the results were pretty much the same as those of the ARIMA by Darts.
Below is a piece of reproducible code using another dataset from Darts, just to show the difference in time (0.3 secs for my ARIMA by hand, and 9 secs for ARIMA by Darts).
The parameters that I am using are start=48, train_length=48, forecast_horizon=1, retrain=False. See the documentation for historical_forecasts.
# Dependencies
import numpy as np
import time
from darts.datasets import AirPassengersDataset
from darts.models import ARIMA
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_percentage_error
# Load dataset as a Darts TimeSeries
series = AirPassengersDataset().load()
# Convert it to a pandas dataframe
data = series.pd_dataframe()
# Take first time
time1 = time.time()
# Create the lagged columns to perform AR(7)
for i in range(1, 8):
    curr_col = "#Passengers_lag_" + str(i)
    data[curr_col] = data["#Passengers"].shift(i)
# Remove first 7 rows
data.dropna(inplace=True)
# Perform rolling_window LinearRegression()
window_size = 48
predictions_ar7_by_hand = []
for t in range(window_size, data.shape[0]):
    # Take the corresponding window of size 48 for training
    X_train, Y_train = data.iloc[(t - window_size) : t, 1:], data.iloc[(t - window_size) : t, 0]
    # Take the corresponding testing values
    X_test, Y_test = data.iloc[t: t+1, 1:], data.iloc[t: t+1, 0].values[0]
    # Fit model
    lr = LinearRegression().fit(X_train, Y_train)
    # Predict using X_test
    prediction = lr.predict(X_test)[0]
    print('------------------------------')
    print(f"Test value: {Y_test}")
    print(f"Predicted value: {prediction}")
    predictions_ar7_by_hand.append(prediction)
print('**************** Finished ARIMA by hand **********************')
print(f"Time: {time.time()-time1}")
# Take second time
time2 = time.time()
# Forecast using ARIMA by Darts
ar7 = ARIMA(p=7, d=0, q=0)
predictions_ar7 = ar7.historical_forecasts(series, start=48, train_length=48, forecast_horizon=1, retrain=False)
print('**************** Finished ARIMA by DARTS **********************')
print(f"Time: {time.time()-time2}")
# Convert it to a pandas dataframe
predictions_ar7 = predictions_ar7.pd_dataframe()
# Insert the results in the same df
predictions_ar7['#Passengers_arima_by_hand'] = [np.nan for x in range(7)] + predictions_ar7_by_hand
# Rename some columns
predictions_ar7.rename(columns={'#Passengers': '#Passengers_arima_by_darts'}, inplace=True)
# Add original data
predictions_ar7['#Passengers'] = data['#Passengers']
# Remove NaN, only 7 rows
predictions_ar7.dropna(inplace=True)
# Print the results
print("*********** RESULTS **********")
mape1 = mean_absolute_percentage_error(predictions_ar7['#Passengers'], predictions_ar7['#Passengers_arima_by_darts'])
mape2 = mean_absolute_percentage_error(predictions_ar7['#Passengers'], predictions_ar7['#Passengers_arima_by_hand'])
mape3 = mean_absolute_percentage_error(predictions_ar7['#Passengers_arima_by_darts'], predictions_ar7['#Passengers_arima_by_hand'])
print(f"MAPE by Darts: {mape1}, MAPE by hand: {mape2}, MAPE between two arima: {mape3}")
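As a sanity check on the 7-row NaN padding used above, here is a small sketch of the length arithmetic. It assumes the Darts output has one row per time step from position 48 to the end of the series, which is what the padding in the code implies; the variable names are only illustrative.

```python
n_obs = 144        # monthly observations in the AirPassengers dataset
n_lags = 7         # AR(7) lag columns
window_size = 48   # training window of the by-hand loop
start = 48         # `start` passed to historical_forecasts

# Rows left after dropping the first 7 rows with NaN lags
rows_after_dropna = n_obs - n_lags

# The by-hand loop runs t = window_size .. rows_after_dropna - 1
n_by_hand = rows_after_dropna - window_size

# Assumption: one Darts forecast per step from position `start` onward
n_darts_assumed = n_obs - start

# Rows that exist only in the Darts output and need NaN padding
padding = n_darts_assumed - n_by_hand

# Position in the original series of the first by-hand prediction
first_by_hand_position = n_lags + window_size

print(rows_after_dropna, n_by_hand, n_darts_assumed, padding, first_by_hand_position)
```

So the by-hand version produces 7 fewer predictions than Darts, and the final dropna() removes exactly those unmatched rows before the MAPE values are computed.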
I'm really curious about the reason behind this difference in time. On this small dataset the difference is still acceptable, but on my original one Darts took about 3 minutes, which is really strange compared to my simple and basic implementation.
If anyone has any idea about the reason, I would love to hear it.