
Purpose

I want to forecast daily volatility with an EGARCH(1,1) model using the arch package.
Forecast interval: 01-04-2015 to 12-06-2018 (mm-dd-yyyy format)

So I should fit the EGARCH(1,1) model on earlier data (for example, from 2013 until 2015) and then forecast daily volatility for 01-04-2015 to 12-06-2018.


Code

So I tried to write it like this:

# Packages that we need
import numpy as np  # needed for np.log below
from pandas_datareader import data as web
from arch import arch_model
import pandas as pd
#---------------------------------------

# grab Microsoft daily adjusted close price data from '01-03-2013' to '12-06-2018' and store it in DataFrame
df = pd.DataFrame(web.get_data_yahoo('MSFT' , start='01-03-2013' , end='12-06-2018')['Adj Close'])

#---------------------------------------

# calculate daily rate of return that is necessary for predicting daily Volatility by EGARCH
daily_rate_of_return_EGARCH = np.log(df.loc[ : '01-04-2015']/df.loc[ : '01-04-2015'].shift())
# drop NaN values
daily_rate_of_return_EGARCH = daily_rate_of_return_EGARCH.dropna()

#---------------------------------------

# Volatility Forecasting By EGARCH(1,1)
model_EGARCH = arch_model(daily_rate_of_return_EGARCH, vol='EGARCH' , p = 1 , o = 0 , q = 1)
fitted_EGARCH = model_EGARCH.fit(disp='off')

#---------------------------------------

# and finally, Forecasting step
# Note that as mentioned in `purpose` section, predict interval should be from '01-04-2015' to end of the data frame
horizon = len(df.loc['01-04-2015' : ])
volatility_FORECASTED = fitted_EGARCH.forecast(horizon = horizon , method='simulation')

Error

And then I got this error:

MemoryError                               Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_12900/1021856026.py in <module>
      1 horizon = len(df.loc['01-04-2015':])
----> 2 volatility_FORECASTED = fitted_EGARCH.forecast(horizon = horizon , method='simulation') 

MemoryError: Unable to allocate 3.71 GiB for an array with shape (503, 1000, 989) and data type float64

It seems arch is trying to store a huge amount of data.
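
The size in the error message follows directly from the array shape. As a rough check, assuming the three axes are (forecast origins, simulated paths, horizon), which is consistent with the 1000-path default of the simulation method and the 989-step horizon requested here, the arithmetic works out as follows:

```python
# Shape reported in the MemoryError above; the meaning of each axis is an
# assumption based on the forecast call (origins, simulated paths, horizon).
n_origins, n_paths, horizon = 503, 1000, 989
bytes_per_float64 = 8

total_bytes = n_origins * n_paths * horizon * bytes_per_float64
total_gib = total_bytes / 2**30

print(f"{total_bytes:,} bytes = {total_gib:.2f} GiB")

# Keeping a single forecast origin would shrink the same array by a factor of 503.
single_origin_mib = n_paths * horizon * bytes_per_float64 / 2**20
print(f"one origin: {single_origin_mib:.2f} MiB")
```

Almost all of the allocation comes from repeating the simulated paths for every row of the fitted sample, not from the long horizon alone.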

Expected Result

What I expect is a simple pandas.Series that contains the daily volatility forecasts from '01-04-2015' until '12-06-2018'. Precisely, I mean something like this:
(Note: date format --> mm-dd-yyyy)

    (DATE)     (VOLATILITY)
'01-04-2015'      .....
'01-05-2015'      .....
'01-06-2015'      .....
.                   .
.                   .
.                   .
'12-06-2018'      .....

How can I achieve this?

Shayan
  • Are you trying to create a series of 1-step-ahead forecasts, or are you trying to create the series of h-step-ahead forecasts for h=1,2,...,+4 years? – Kevin S Nov 23 '21 at 10:16
  • @KevinS "trying to create the series of h-step-ahead forecasts for h=1,2,...,+4 years". Just as mentioned in the **Expected Result** section. I want to forecast volatility with **EGARCH(1,1)** for 800 days ahead (for example). So what I need is just 800 forecasted volatility values and nothing else. – Shayan Nov 23 '21 at 10:38
  • This happens because you have to use simulation to forecast when the horizon is > 1 in an EGARCH model. The simulation paths are stored and returned as part of the `ARCHModelForecast` object. Producing very long-horizon forecasts via simulation is not a goal of the project. You have a few options here. First, use a model that has analytical forecasts, such as GARCH. Second, forecast over a smaller horizon, check whether the forecast becomes constant, and then use that value; it seems to have converged after around 20 observations. Finally, you could write custom forecast code. – Kevin S Nov 25 '21 at 09:39
  • @KevinS Thanks for your advice, professor. You said *"Finally, you could write custom forecast code."* What if I use a **sliding window**? For example: *use days 1-5 to predict day 6* | *use days 2-6 to predict day 7* | *use days 3-7 to predict day 8* and *so on*. This way, each forecast is only one step ahead. Could this be a possible solution for this kind of long-run prediction? – Shayan Nov 25 '21 at 10:17
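
As an illustration of the "custom forecast code" option raised in the comments, here is a minimal NumPy sketch of a simulation-based EGARCH(1,1) variance forecast that keeps only the current state of each path, so nothing of size simulations × horizon is ever stored. It assumes standard normal shocks and no asymmetry term (o = 0); the function name and all parameter values are hypothetical and are not taken from a fitted model or from the arch package.

```python
import numpy as np


def egarch11_variance_forecast(omega, alpha, beta, last_log_var, last_std_resid,
                               horizon, simulations=1000, seed=0):
    """Mean simulated variance for steps 1..horizon of an EGARCH(1,1) with o=0.

    Recursion (standard normal shocks e[t] assumed):
        ln s2[t] = omega + alpha * (|e[t-1]| - sqrt(2/pi)) + beta * ln s2[t-1]
    Only the current log-variance of each path is kept, so memory is
    O(simulations + horizon) instead of O(simulations * horizon).
    """
    rng = np.random.default_rng(seed)
    abs_mean = np.sqrt(2.0 / np.pi)  # E|e| for a standard normal shock

    # Step 1 is deterministic: it depends only on the last observed shock.
    log_var = np.full(
        simulations,
        omega + alpha * (abs(last_std_resid) - abs_mean) + beta * last_log_var,
    )
    mean_variance = np.empty(horizon)
    mean_variance[0] = np.exp(log_var).mean()

    # Later steps need simulated shocks; average the variance across paths.
    for h in range(1, horizon):
        shocks = rng.standard_normal(simulations)
        log_var = omega + alpha * (np.abs(shocks) - abs_mean) + beta * log_var
        mean_variance[h] = np.exp(log_var).mean()
    return mean_variance


# Hypothetical parameter values, for illustration only (not fitted to MSFT).
variance_path = egarch11_variance_forecast(
    omega=-0.4, alpha=0.2, beta=0.95,
    last_log_var=-8.5, last_std_resid=1.3, horizon=989,
)
volatility_path = np.sqrt(variance_path)
print(volatility_path[:3], volatility_path[-3:])
```

Comparing the first and last few values is a quick way to see whether the forecast has flattened out, which is the behavior described in the comment about using a shorter horizon.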

1 Answer


You only need to pass the reindex=False keyword argument, and the memory requirement drops dramatically. This requires a recent version of the arch package. With reindex=False, the forecast output contains only the forecast values, so its shape and alignment differ from the historical behavior.

# Packages that we need
import numpy as np  # needed for np.log below
from pandas_datareader import data as web
from arch import arch_model
import pandas as pd
#---------------------------------------

# grab Microsoft daily adjusted close price data from '01-03-2013' to '12-06-2018' and store it in DataFrame
df = pd.DataFrame(web.get_data_yahoo('MSFT' , start='01-03-2013' , end='12-06-2018')['Adj Close'])

#---------------------------------------

# calculate daily rate of return that is necessary for predicting daily Volatility by EGARCH
daily_rate_of_return_EGARCH = np.log(df.loc[ : '01-04-2015']/df.loc[ : '01-04-2015'].shift())
# drop NaN values
daily_rate_of_return_EGARCH = daily_rate_of_return_EGARCH.dropna()

#---------------------------------------

# Volatility Forecasting By EGARCH(1,1)
model_EGARCH = arch_model(daily_rate_of_return_EGARCH, vol='EGARCH' , p = 1 , o = 0 , q = 1)
fitted_EGARCH = model_EGARCH.fit(disp='off')

#---------------------------------------

# and finally, Forecasting step
# Note that as mentioned in `purpose` section, predict interval should be from '01-04-2015' to end of the data frame
horizon = len(df.loc['01-04-2015' : ])
volatility_FORECASTED = fitted_EGARCH.forecast(horizon = horizon , method='simulation', reindex=False)
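
The call above returns a forecast object rather than the Series asked for in the question. A minimal sketch of the remaining step, assuming the object's `variance` attribute is a DataFrame with one row per forecast origin and one column per step ahead (the layout expected with reindex=False; check it on your installed version). The helper name, the stand-in data, and the business-day calendar are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def variance_row_to_volatility(variance_frame, dates):
    """Turn the last row of an (origins x horizon) variance table into a Series."""
    variances = variance_frame.iloc[-1].to_numpy()
    if len(variances) != len(dates):
        raise ValueError("need exactly one date per forecast step")
    return pd.Series(np.sqrt(variances), index=dates, name="volatility")


# Stand-in for `volatility_FORECASTED.variance`: one row, one column per step.
# The values and column labels here are made up for illustration.
horizon = 5
stand_in = pd.DataFrame(
    [np.linspace(1.0e-4, 2.0e-4, horizon)],
    index=pd.to_datetime(["2015-01-02"]),
    columns=[f"h.{i}" for i in range(1, horizon + 1)],
)

# Business days are an assumption; the question's own code could instead pass
# df.loc['01-04-2015':].index so that exchange holidays are respected.
dates = pd.bdate_range("2015-01-05", periods=horizon)
volatility = variance_row_to_volatility(stand_in, dates)
print(volatility)
```

Taking the square root converts the forecast variances into volatilities on the same scale as the returns used to fit the model.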
Kevin S
  • Thanks a lot, professor. I ran your solution and got this **WARNING**: *"**DataScaleWarning**: y is poorly scaled, which may affect convergence of the optimizer when estimating the model parameters. The scale of y is 0.0001942. Parameter estimation work better when this value is between 1 and 1000. The recommended rescaling is 100 * y."* Should I be worried about the results? I know that forecasting volatility over such a long period is not very meaningful and the forecasts are not guaranteed to be accurate, but should I rescale the data or something? – Shayan Nov 25 '21 at 10:11
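
On the rescaling question in the comment above: multiplying the returns by 100 expresses them in percent, and a volatility obtained on that scale can be converted back by dividing by 100 (a variance by 100² = 10,000). A small sketch of that relationship with synthetic returns; the numbers are illustrative and are not MSFT data.

```python
import numpy as np

rng = np.random.default_rng(42)
returns = rng.normal(loc=0.0, scale=0.014, size=500)  # synthetic daily log returns

scale = 100
returns_pct = scale * returns  # what would be passed to the model instead

# Variance scales with the square of the factor, volatility with the factor itself.
var_raw = returns.var()
var_pct = returns_pct.var()
vol_back = np.sqrt(var_pct) / scale

print(var_raw, var_pct, vol_back)
```

The rescaled series has a variance in the range the warning asks for, and dividing the resulting volatility by the same factor recovers the original units.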