
I'm working on a forecasting model, where I have monthly data from 2014 to current month (March 2018).

My data include a column for billings and a column for quote amounts, e.g.:

Year  Quarter  Month   BILLINGS  QUOTES
2014  2014Q1   201401  100       500
2014  2014Q1   201402  150       600
2014  2014Q1   201403  200       700

I'm using this to predict monthly sales, and I'm attempting to use the monthly quote amounts as an xreg regressor.

I reviewed the question "ARIMA forecasting with auto.Arima() and xreg", but I'm still missing something to accomplish what I'm trying to do.

Question: Can somebody show an example of forecasting OUT OF SAMPLE using xreg? I understand that in order to accomplish this, you need to forecast your xreg variables out of sample, but I cannot figure out how to pass those future values in.

I tried using something like futurevalues$mean after predicting the values, but this did not work.

Here is my code:

library(forecast)

sales = read.csv('sales.csv')

# Below, I'm creating a training set for the models through 
#  December 2017 (48 months).
train = sales[sales$TRX_MON<=201712,]

# I will also create a test set for our data from January 2018 (3 months)
test = sales[sales$TRX_MON>201712,]

dtstr2 <- ts(train, start=2014, frequency=12)
dtste2 <- ts(test, start=2018, frequency=12)

fit2 <- auto.arima(dtstr2[,"BILLINGS"], xreg=dtstr2[,"QUOTES"])
fcast2 <- forecast(fit2, xreg=dtste2[,"QUOTES"], h=24)
fcast2

The code above works, but only gives me a 3 month forecast, e.g.

                  Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
Jan 2018          70                60       100      50       130
Feb 2018          80                70       110      60       140
Mar 2018          90                80       120      70       150

I have scoured as many blogs and topics as I could find, seeking an example of using auto.arima with an out of sample forecast of an xreg variable, and cannot find any that have done this.

Can anybody help?

Thank you much.

Dana Hagist
  • You will only get as many forecasts as you provide covariates for. So here you provide 3 and get 3. If you want more you have to provide a matrix of x values with as many rows as you want predictions. – atiretoo Mar 29 '18 at 17:45
  • Hi @atiretoo, thank you for the reply. In this case, would I have to manually create a matrix of x values to push in, or do you know of a way to push forecasted values in? For example, if I forecast my xreg variable(s), I will get a similar output including a point forecast and confidence intervals. Can I push my point forecast into the model rather than manually creating a matrix? Thanks again. – Dana Hagist Mar 29 '18 at 17:55
  • Well, I tried making up an MWE from data posted in the linked question, but I'm running into problems that might have to do with that data and not your problem. Without your data to hand I might be solving a non-problem. – atiretoo Mar 29 '18 at 19:15
  • I figured out my problem ... – atiretoo Mar 29 '18 at 19:29
  • Thank you @atiretoo... let me give this a shot and I'll let you know how it works. I'm using actual company data for the analysis which is why I couldn't post it. If I can't get this approach to work, I'll spin up some mock data to use. – Dana Hagist Mar 29 '18 at 19:47
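
The point made in the first comment can be checked with a small sketch. It uses simulated data, so the series `quotes` and `billings` and all numbers below are made up for illustration and are not the asker's data. With the forecast package, the number of forecasts should follow the number of future regressor values supplied, and `h` appears to have no effect once `xreg` is given.

```r
library(forecast)

# Simulated monthly data (hypothetical stand-ins for QUOTES and BILLINGS)
set.seed(1)
n <- 48
quotes   <- ts(500 + cumsum(rnorm(n, 5, 20)), start = c(2014, 1), frequency = 12)
billings <- ts(0.2 * quotes + rnorm(n, 0, 10), start = c(2014, 1), frequency = 12)

fit <- auto.arima(billings, xreg = quotes)

# Three future regressor values give three forecasts, even with h = 24
future3 <- rep(quotes[n], 3)   # placeholder future values: last observed quote
fc3 <- forecast(fit, xreg = future3, h = 24)
length(fc3$mean)

# Twenty-four future regressor values give twenty-four forecasts
future24 <- rep(quotes[n], 24)
fc24 <- forecast(fit, xreg = future24)
length(fc24$mean)
```

So to forecast 24 months ahead, 24 future values of the regressor are needed, whether they come from a plan, a scenario, or a separate forecast.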

2 Answers


Here is an MWE for out of sample prediction of a time series with unknown covariates. This relies on the data provided for this question as well as @Raad's excellent answer.

library("forecast")

dta = read.csv("~/stackexchange/data/xdata.csv")[1:96,]
dta <- ts(dta, start = 1)

# to illustrate out of sample forecasting with covariates, let's split the data
train <- window(dta, end = 90)
test <- window(dta, start = 91)

# fit model
covariates <- c("Customers", "Open", "Promo")
fit <- auto.arima(train[,"Sales"], xreg = train[, covariates])

Forecast from the test data:

fcast <- forecast(fit, xreg = test[, covariates])

But what if we do not know the values of Customers yet? The desired goal is to forecast Customers and then use those forecast values in the forecast of Sales. Open and Promo are under the control of the manager, so will be "fixed" in the forecast.

customerfit <- auto.arima(train[,"Customers"], xreg = train[, c("Open","Promo")])

I will attempt to forecast 2 weeks out, and assume there is no promotion.

newdata <- data.frame(Open = rep(c(1,1,1,1,1,1,0), times = 2),
                          Promo = 0)

customer_fcast <- forecast(customerfit, xreg = as.matrix(newdata))

# the values of customer are in `customer_fcast$mean`

newdata$Customers <- as.vector(customer_fcast$mean)

It is critical to get the newdata columns in the same order as in the original data! forecast() matches regressors by position.

sales_fcast <- forecast(fit, xreg = as.matrix(newdata[, covariates]))
plot(sales_fcast)
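
Because the column order matters, it can help to build the future regressor matrix by name rather than by numeric index. The following is only a sketch: `align_xreg` is a hypothetical helper and not part of the forecast package, and the `Customers` values are placeholders rather than real forecasts.

```r
# Hypothetical helper (not from the forecast package): reorder the columns of
# `newdata` to match the training regressors and return a numeric matrix.
align_xreg <- function(newdata, train_cols) {
  missing_cols <- setdiff(train_cols, colnames(newdata))
  if (length(missing_cols) > 0) {
    stop("newdata is missing regressors: ", paste(missing_cols, collapse = ", "))
  }
  out <- as.matrix(newdata[, train_cols, drop = FALSE])
  storage.mode(out) <- "double"
  out
}

covariates <- c("Customers", "Open", "Promo")

# Two weeks of future values; columns are deliberately in a different order
newdata <- data.frame(Open = rep(c(1, 1, 1, 1, 1, 1, 0), times = 2), Promo = 0)
newdata$Customers <- seq(500, by = 10, length.out = 14)  # placeholder values

future_xreg <- align_xreg(newdata, covariates)
colnames(future_xreg)
```

The result could then be passed as `xreg = future_xreg` to `forecast()`, and a missing regressor fails early with a readable message instead of silently shifting columns.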

Created on 2018-03-29 by the reprex package (v0.2.0).

atiretoo
  • I've been able to use this approach to forecast out to future periods. Thank you very much for the help. Last question on this: are we able to represent these future periods as actual time references? When I was not using xreg, my forecasted values would represent the future of the time series, and now I'm simply seeing (41,42,43,etc.). – Dana Hagist Mar 29 '18 at 21:24
  • Yes, the frequency and units will be the same as the time series in the original fit. – atiretoo Mar 29 '18 at 23:50
  • I used this approach for an interview, looks great. I'll buy ya many coffees if I get the job mate. Hadn't touched TS in a hot minute – ctde Dec 11 '22 at 06:18
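
To illustrate the comment about time references, here is a sketch on simulated data (the series `y` and `x` are made up). It assumes the series passed to `auto.arima()` is a `ts` object with a monthly start date. In that case the forecast should carry on the same calendar, even when the future regressor is a plain numeric vector. If the response were a plain vector, or a `ts` starting at 1, the forecast index would just be consecutive integers.

```r
library(forecast)

set.seed(2)
# Response and regressor as monthly series starting January 2014
y <- ts(100 + cumsum(rnorm(48)), start = c(2014, 1), frequency = 12)
x <- ts(rnorm(48), start = c(2014, 1), frequency = 12)

fit <- auto.arima(y, xreg = x)

# Future regressor values as a plain vector with no time attributes
future_x <- rnorm(6)
fc <- forecast(fit, xreg = future_x)

# The time index of the forecast is taken from the fitted series
start(fc$mean)
frequency(fc$mean)
```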

Thank you again for assisting with this.

I was able to use a combination of the above advice to get what I was looking for.

Ultimately, I created time series objects for my exogenous variables and forecast those. Then I took the $mean outputs of those forecasts, created time series objects from them (of whatever length I wanted to forecast my original variable), and fed those into my original forecasting model.
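
A minimal sketch of that two-stage workflow, on simulated data rather than my company data, might look like the following. The series `quotes` and `billings` and the 24-month horizon are assumptions made for illustration.

```r
library(forecast)

# Simulated monthly data (hypothetical stand-ins for QUOTES and BILLINGS)
set.seed(42)
n <- 48
quotes   <- ts(500 + 2 * seq_len(n) + rnorm(n, 0, 25),
               start = c(2014, 1), frequency = 12)
billings <- ts(50 + 0.2 * quotes + rnorm(n, 0, 10),
               start = c(2014, 1), frequency = 12)

h <- 24  # months to forecast

# Stage 1: forecast the exogenous variable on its own
quotes_fit <- auto.arima(quotes)
quotes_fc  <- forecast(quotes_fit, h = h)

# Stage 2: fit billings with quotes as a regressor, then forecast using the
# stage-1 point forecasts ($mean) as the future regressor values
billings_fit <- auto.arima(billings, xreg = quotes)
billings_fc  <- forecast(billings_fit, xreg = quotes_fc$mean)

billings_fc
```

One caveat: the prediction intervals in `billings_fc` treat the forecast quotes as if they were known, so they do not reflect the uncertainty in the stage-1 forecast.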

Dana Hagist