I'm trying to forecast future load and renewable load factors of the French power system.
I have a database of 61368 hourly observations (7 years * 8760 hours, plus the leap days) from 2012 to 2018 for each variable (consumption, wind load factor, PV load factor). I want to use these past observations to forecast each variable out to 2024.
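As a quick sanity check on the series length, here is a minimal sketch in base R. It assumes the data run from 2012-01-01 00:00 to 2018-12-31 23:00 with no gaps and no daylight-saving shifts (the exact timestamps are an assumption, they are not shown in this post):

```r
# Assumed span: 2012-01-01 00:00 through 2018-12-31 23:00, in UTC so that
# daylight-saving changes do not add or remove hours.
hours <- seq(
  from = as.POSIXct("2012-01-01 00:00:00", tz = "UTC"),
  to   = as.POSIXct("2018-12-31 23:00:00", tz = "UTC"),
  by   = "hour"
)

n_obs <- length(hours)
n_obs                        # 61368

# 7 calendar years of 8760 hours, plus two leap days (2012 and 2016)
n_obs == 7 * 8760 + 2 * 24   # TRUE
```

So the 61368 rows correspond to 7 calendar years, two of which are leap years.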
To do this, I read in Hyndman's book that I could use an ARIMA model with Fourier terms as regressors, since my variables have multiple seasonalities and seasonal periods longer than 200 (at least for the annual seasonality).
The problem is finding the optimal K (the number of Fourier terms) for each seasonal period in a reasonable amount of time.
I tried to use auto.arima(consumption, seasonal = FALSE, xreg = fourier(consumption, K = c(12, 84, 4380), h = 8760*6))
but it takes hours to solve.
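To make the size of that attempt concrete, here is a minimal base-R sketch of what K controls. `fourier_terms` is a hypothetical helper written for illustration, not the forecast package's `fourier()`; it only mirrors the general idea that each harmonic k <= K contributes one sine and one cosine column, so K harmonics give 2K regressors for a given period (I understand the real function may additionally drop redundant columns):

```r
# Hypothetical helper (NOT forecast::fourier): builds K sine/cosine pairs
# for a single seasonal period over time steps 1..n.
fourier_terms <- function(n, period, K) {
  t <- seq_len(n)
  cols <- lapply(seq_len(K), function(k) {
    cbind(sin(2 * pi * k * t / period), cos(2 * pi * k * t / period))
  })
  X <- do.call(cbind, cols)
  colnames(X) <- paste0(rep(c("S", "C"), times = K),
                        rep(seq_len(K), each = 2), "-", period)
  X
}

# Two days of hourly data, daily period, 3 harmonics -> 6 columns
X_daily <- fourier_terms(n = 48, period = 24, K = 3)
dim(X_daily)                 # 48 6

# Rough column count for K = c(12, 84, 4380), before any duplicates are dropped
2 * sum(c(12, 84, 4380))     # 8952
```

With several thousand regressors, every candidate model that auto.arima tries has to estimate all of those coefficients, which is presumably why it runs for so long.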
So here's what I'm trying now:
#Creating msts objects for main variables
demand <- msts(dataset$Consommation, seasonal.periods = c(24,168,8760))
#Seasonality detection for main variables
ndiffs(demand)
p_demand <- periodogram(demand)
data.table(period=1/p_demand$freq, spec=p_demand$spec)[order(-spec)][1:2]
#Splitting time series into train and test sets
train_demand <- ts(demand[1:55230])
test_demand <- demand[55231:61368]
#Creation of a base model
fit0_demand <- auto.arima(train_demand)
(bestfit_demand <- list(aicc=fit0_demand$aicc, i=0, j=0, fit_demand=fit0_demand))
#Forecasting and plotting base model
fc0_demand <- forecast(fit0_demand, h= 8760*6)
plot(fc0_demand)
#Choosing the best maximal number of Fourier terms by AICc
for (i in 1:5) {
  for (j in 1:5) {
    #Fourier terms for the annual (i harmonics) and daily (j harmonics) seasonality
    z1_demand <- fourier(ts(train_demand, frequency = 8777.143), K = i)
    z2_demand <- fourier(ts(train_demand, frequency = 24), K = j)
    fit_demand <- auto.arima(train_demand, xreg = cbind(z1_demand, z2_demand), seasonal = FALSE)
    #Keep the model with the lowest AICc seen so far
    if (fit_demand$aicc < bestfit_demand$aicc) {
      bestfit_demand <- list(aicc = fit_demand$aicc, i = i, j = j, fit_demand = fit_demand)
    }
  }
}
bestfit_demand
#Forecasting with best fits
xreg_future <- cbind(fourier(ts(train_demand, frequency = 8777.143), K = bestfit_demand$i, h = 6*8760),
                     fourier(ts(train_demand, frequency = 24), K = bestfit_demand$j, h = 6*8760))
fc_demand <- forecast(bestfit_demand$fit_demand, xreg = xreg_future)
plot(fc_demand)
There are no error messages, but my for loop takes hours and hours to run (I launched the code 17 hours ago and it is still working through the loop). Is there a faster way to do this? I have an i5 laptop with 8 GB of RAM, by the way.
Thanks for the help!
PS: I didn't include the parts of the code that load the packages and the data.