Future dataset is incomplete when using Fable Prophet

Question

I'm trying to view the out-of-sample performance scores after running fable.prophet. Note that the forecast is grouped by type and looks 5 observations ahead.

Here is the code:

library(tibble)
library(tsibble)
library(fable.prophet)
library(dplyr)
library(lubridate)

lax_passengers <- read.csv("https://raw.githubusercontent.com/mitchelloharawild/fable.prophet/master/data-raw/lax_passengers.csv")

lax_passengers <- lax_passengers %>%
  mutate(datetime = mdy_hms(ReportPeriod)) %>%
  group_by(month = yearmonth(datetime), type = Domestic_International) %>%
  summarise(passengers = sum(Passenger_Count)) %>%
  ungroup()

lax_passengers <- as_tsibble(lax_passengers, index = month, key = type)
fit <- lax_passengers %>% 
  model(
    mdl = prophet(passengers ~ growth("linear") + season("year", type = "multiplicative")),
  )
fit

test_tr <- lax_passengers %>%
  slice(1:(n()-5)) %>%
  stretch_tsibble(.init = 12, .step = 1)


fc <- test_tr %>%
  model(
    mdl = prophet(passengers ~ growth("linear") + season("year", type = "multiplicative")),
  ) %>%
  forecast(h = 5)


fc %>% accuracy(lax_passengers)

When I run fc %>% accuracy(lax_passengers), I get the following warning:

Warning message:
The future dataset is incomplete, incomplete out-of-sample data will be treated as missing. 
5 observations are missing between 2019 Apr and 2019 Aug

How do I make the future dataset complete? I believe the performance scores aren't accurate because of the 5 missing observations.

It seems that when I stretch the tsibble, the slice doesn't work as intended: it doesn't remove the last 5 observations from each type.

score 1 · Accepted Answer · answered Nov 29 '22 at 22:33

The slice() function works on row positions across the entire dataset unless the data is grouped, so it is only removing the last 5 rows from your last key (type == "International"). To remove the last 5 rows from every key, you'll need to group by the key before slicing.

test_tr <- lax_passengers %>%
  group_by_key() %>% 
  slice(1:(n()-5)) %>%
  ungroup() %>% 
  stretch_tsibble(.init = 12, .step = 1)
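
To illustrate the difference, here is a minimal sketch on a toy tibble. The data and the names toy, ungrouped and grouped are made up for illustration and are not part of the LAX dataset. It assumes only dplyr, and uses group_by(type) in place of group_by_key() because the toy data is a plain tibble rather than a tsibble.

```r
library(dplyr)

# Hypothetical toy data: two types with 8 rows each
toy <- tibble(
  type  = rep(c("Domestic", "International"), each = 8),
  t     = rep(1:8, times = 2),
  value = 1:16
)

# Ungrouped: n() is 16, so only the final 5 rows of the whole table are dropped
ungrouped <- toy %>%
  slice(1:(n() - 5))

# Grouped: n() is 8 within each type, so every type loses its last 5 rows
grouped <- toy %>%
  group_by(type) %>%
  slice(1:(n() - 5)) %>%
  ungroup()

count(ungrouped, type)  # Domestic keeps 8 rows, International keeps 3
count(grouped, type)    # both types keep 3 rows
```

In the ungrouped case the first type is left untouched, which matches the situation in the question: the forecasts for that key run past the data that was meant to be held out.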

Mitchell O'Hara-Wild

This is exactly what I was looking for. I wasn't aware that you could group by keys. Thank you! – QMan5 Nov 30 '22 at 16:48

`group_by_key()` is just a shortcut for `group_by(type)` here. But yes, most operations in dplyr won't be applied to each key by default unless the data is grouped. – Mitchell O'Hara-Wild Nov 30 '22 at 21:58
