I am following this tutorial using the sweep package to perform tidy time series forecasting for groups of time series. Sweep extends the broom package to tidy forecast objects.
Tutorial here: https://rdrr.io/cran/sweep/f/vignettes/SW01_Forecasting_Time_Series_Groups.Rmd
Problem: the time series in my data have different lengths and start dates. In the tutorial, a fixed start is passed through to tk_ts() because every time series there has the same start and end date:
monthly_qty_by_cat2_ts <- monthly_qty_by_cat2_nest %>%
    mutate(data.ts = map(.x     = data.tbl,
                         .f     = tk_ts,
                         select = -order.month,
                         start  = 2011, # <- see the fixed start date here
                         freq   = 12))
Question: How do I create a list column of time series objects using map(), as in the example above (and in the tutorial), but with the correct start and end date for each series, given that these differ from series to series?
Packages:
library(tidyquant)
library(sweep)
library(timetk)
library(forecast)
library(tidyverse)
Reproducible Sample Data:
df <- structure(list(id = c("series_1", "series_1", "series_1", "series_1",
"series_1", "series_1", "series_1", "series_1", "series_1", "series_1",
"series_1", "series_1", "series_2", "series_2", "series_2", "series_2",
"series_2", "series_2", "series_2", "series_2", "series_2", "series_2",
"series_2", "series_2", "series_2", "series_2", "series_2", "series_2",
"series_2", "series_2", "series_2", "series_2", "series_2", "series_2",
"series_2", "series_2", "series_3", "series_3", "series_3", "series_3",
"series_3", "series_3", "series_3", "series_3", "series_3", "series_3",
"series_3", "series_3", "series_3", "series_3", "series_3", "series_3",
"series_3", "series_3", "series_3", "series_3", "series_3", "series_3",
"series_3", "series_3", "series_3", "series_3", "series_3", "series_3",
"series_3", "series_3", "series_3", "series_3", "series_3", "series_3",
"series_3", "series_3"), date = structure(c(10957, 10988, 11017,
11048, 11078, 11109, 11139, 11170, 11201, 11231, 11262, 11292,
13787, 13818, 13848, 13879, 13910, 13939, 13970, 14000, 14031,
14061, 14092, 14123, 14153, 14184, 14214, 14245, 14276, 14304,
14335, 14365, 14396, 14426, 14457, 14488, 15706, 15737, 15765,
15796, 15826, 15857, 15887, 15918, 15949, 15979, 16010, 16040,
16071, 16102, 16130, 16161, 16191, 16222, 16252, 16283, 16314,
16344, 16375, 16405, 16436, 16467, 16495, 16526, 16556, 16587,
16617, 16648, 16679, 16709, 16740, 16770), class = "Date"), value = c(0.526816892903298,
0.0640646643005311, 0.569032567087561, 0.733993547270074, 0.742038151714951,
0.273655793862417, 0.167404572479427, 0.766059899237007, 0.60176682821475,
0.0769246644340456, 0.162491872673854, 0.323168716160581, 0.179594057612121,
1.096650313586, 0.894524970557541, 1.55353087605909, 1.50662920810282,
1.06641945429146, 1.95049989689142, 0.226111006457359, 0.644822218455374,
0.998987099621445, 0.303691457025707, 0.782052680384368, 1.59218573896214,
0.171859007328749, 1.9222901831381, 1.4127164632082, 0.919900813139975,
1.93520273640752, 0.00968976970762014, 0.204170028213412, 1.90123205445707,
1.05964627675712, 1.40747981145978, 0.476186634972692, 1.56826665904373,
0.106335987104103, 2.7993093256373, 1.07078968570568, 0.668198951287195,
0.584522894583642, 0.753677956061438, 2.76492932089604, 2.17496411106549,
2.56561762047932, 0.586419345578179, 1.7261581714265, 1.38705582660623,
0.708714888431132, 1.91359720285982, 1.85413848585449, 1.85429209470749,
2.18856360157952, 1.00432092184201, 0.588805445702747, 2.95583719946444,
0.382465981179848, 0.711439447710291, 1.24924974096939, 0.961857272777706,
2.26519317110069, 1.10985011514276, 0.938654307508841, 0.985875837039202,
1.13028976111673, 2.90536748478189, 0.795255574397743, 1.4741945641581,
2.02167924796231, 1.2093570465222, 1.47486943169497)), .Names = c("id",
"date", "value"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-72L))
After Nesting:
df_nest <- df %>%
    group_by(id) %>%
    nest(.key = data.tbl)
From here I would like to mutate a new list column that holds the same data as data.tbl, coerced to a ts object as in the tutorial (so that it can be used with the forecast package), but with the correct start and end date for each series.
I want to apply something like this:
df_ts <- df_nest %>%
    mutate(data.ts = map(.x     = data.tbl,
                         .f     = tk_ts,
                         select = -date,
                         start  = c(2000, 1), # <- Problem HERE
                         freq   = 12))
But this only gives the correct start date for series_1, because the same fixed start is applied to every series.
How do I mutate this new list column of ts objects with the correct start and end dates for each series?
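To make the goal concrete, here is a minimal sketch of the kind of per-series start I am after. It is only an illustration under assumptions: `toy` and `toy_ts` are made-up names and data (not my real `df`), it swaps in base `stats::ts()` for `tk_ts()`, and it assumes each series is monthly with no missing months, so the end date follows from the start, the frequency, and the number of rows.

```r
library(tidyverse)
library(lubridate)

# Hypothetical toy data: two monthly series with different start dates
toy <- tibble(
  id    = rep(c("a", "b"), times = c(3, 4)),
  date  = c(seq(as.Date("2000-01-01"), by = "month", length.out = 3),
            seq(as.Date("2007-10-01"), by = "month", length.out = 4)),
  value = c(1, 2, 3, 4, 5, 6, 7)
)

toy_ts <- toy %>%
  group_by(id) %>%
  nest() %>%      # list column is named "data" by default
  ungroup() %>%
  mutate(data.ts = map(data, function(d) {
    d     <- arrange(d, date)   # make sure rows are in time order
    first <- min(d$date)        # each series supplies its own start
    ts(d$value,
       start     = c(year(first), month(first)),
       frequency = 12)          # assumes monthly data without gaps
  }))
```

With this, `start(toy_ts$data.ts[[2]])` reflects October 2007 rather than a hard-coded value. What I do not know is whether there is a cleaner way to get the same result with `tk_ts()` inside the tutorial's workflow.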
Thanks