Why is historical_forecasts on the Darts ARIMA model so slow?

Question

I'm implementing an AR(7) model using the Darts library, and I want to use a rolling window technique to test my results on my dataset.

My dataset is not huge, only 1825 days (5 years, 365*5). The rolling window size is 365 and I am predicting just the next day, so forecast horizon = 1.

With the historical_forecasts method of the ARIMA model, it takes a long time, about 3 minutes, to return the results.

Out of curiosity, I implemented this backtesting technique myself: I created the lagged dataset and fit a simple sklearn LinearRegression(), moving the training window forward at each iteration and predicting the next day. The total time was around 5 seconds, and the results were pretty much the same as those of the Darts ARIMA.
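For reference, the by-hand procedure boils down to roughly the following. This is only a minimal sketch under a few assumptions: it uses NumPy's least squares in place of sklearn's LinearRegression, a synthetic random-walk series stands in for my real data, and rolling_ar_forecast is just an illustrative name, not something from Darts.

```python
import numpy as np


def rolling_ar_forecast(y, p=7, window=365):
    """One-step-ahead AR(p) forecasts, refitting OLS on a sliding window."""
    y = np.asarray(y, dtype=float)
    # Row k holds the target y[p + k]; its lags are y[p + k - 1], ..., y[k].
    targets = y[p:]
    lags = np.column_stack([y[p - i : len(y) - i] for i in range(1, p + 1)])
    # Prepend a column of ones so the regression has an intercept.
    design = np.column_stack([np.ones(len(targets)), lags])

    preds = []
    for t in range(window, len(targets)):
        # Train on the `window` rows just before row t, then predict row t.
        X_train, y_train = design[t - window : t], targets[t - window : t]
        coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
        preds.append(design[t] @ coef)
    return np.array(preds)


# Synthetic stand-in for my data: 5 years of daily values (an assumption).
rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=1825))
preds = rolling_ar_forecast(y, p=7, window=365)
print(preds.shape)
```

Each iteration refits the regression from scratch on the latest 365 rows, so the model is retrained at every step.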

Below is a piece of reproducible code that uses a different dataset, one shipped with Darts, just to show the time difference (0.3 seconds for my ARIMA by hand versus 9 seconds for the Darts ARIMA).

The parameters I am using are start=48, train_length=48, forecast_horizon=1, retrain=False. See the documentation here: historical_forecasts

# Dependencies
import numpy as np
import time
from darts.datasets import AirPassengersDataset
from darts.models import ARIMA
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_percentage_error

# Load dataset as a Darts TimeSeries
series = AirPassengersDataset().load()

# Convert it to a pandas dataframe
data = series.pd_dataframe()

# Take first time
time1 = time.time()

# Create the lagged columns to perform AR(7)
for i in range(1, 8):
    curr_col = "#Passengers_lag_" + str(i)
    data[curr_col] = data["#Passengers"].shift(i)

# Remove first 7 rows
data.dropna(inplace=True)

# Perform rolling_window LinearRegression()
window_size = 48
predictions_ar7_by_hand = []

for t in range(window_size, data.shape[0]):

    # Take the corresponding window of size 48 for training
    X_train, Y_train = data.iloc[(t - window_size) : t, 1:], data.iloc[(t - window_size) : t, 0]

    # Take the corresponding testing values
    X_test, Y_test = data.iloc[t: t+1, 1:], data.iloc[t: t+1, 0].values[0]

    # Fit model
    lr = LinearRegression().fit(X_train, Y_train)

    # Predict using X_test
    prediction = lr.predict(X_test)[0]

    print('------------------------------')
    print(f"Test value: {Y_test}")
    print(f"Predicted value: {prediction}")
    
    predictions_ar7_by_hand.append(prediction)


print('**************** Finished ARIMA by hand **********************')
print(f"Time: {time.time()-time1}")

# Take second time
time2 = time.time()

# Forecast using ARIMA by Darts
ar7 = ARIMA(p=7, d=0, q=0)

predictions_ar7 = ar7.historical_forecasts(series, start=48, train_length=48, forecast_horizon=1, retrain=False)

print('**************** Finished ARIMA by DARTS **********************')
print(f"Time: {time.time()-time2}")

# Convert it to a pandas dataframe
predictions_ar7 = predictions_ar7.pd_dataframe()

# Insert the by-hand results in the same df (pad the first 7 rows with NaN)
predictions_ar7['#Passengers_arima_by_hand'] = [np.nan] * 7 + predictions_ar7_by_hand

# Rename some columns
predictions_ar7.rename(columns={'#Passengers': '#Passengers_arima_by_darts'}, inplace=True)

# Add original data 
predictions_ar7['#Passengers'] = data['#Passengers']

# Remove NaN, only 7 rows
predictions_ar7.dropna(inplace=True)

# Print the results

print("*********** RESULTS **********")

mape1 = mean_absolute_percentage_error(predictions_ar7['#Passengers'], predictions_ar7['#Passengers_arima_by_darts'])
mape2 = mean_absolute_percentage_error(predictions_ar7['#Passengers'], predictions_ar7['#Passengers_arima_by_hand'])
mape3 = mean_absolute_percentage_error(predictions_ar7['#Passengers_arima_by_darts'], predictions_ar7['#Passengers_arima_by_hand'])
print(f"MAPE by Darts: {mape1}, MAPE by hand: {mape2}, MAPE between two arima: {mape3}")

I'm really curious about the reason behind this time difference. On this small dataset the difference is still acceptable, but on my original dataset Darts took about 3 minutes, which is really weird compared to my simple and basic implementation.

If anyone has any idea about the reason, I would love to hear it.
