I've got a time series of half-hourly observations for about 100 days like so:

> df
# A tibble: 4,704 x 3
    city            datetime orders
   <chr>              <time>  <dbl>
1   Wien 2016-05-12 00:00:00      1
2   Wien 2016-05-12 00:30:00      4
3   Wien 2016-05-12 01:00:00      2
4   Wien 2016-05-12 01:30:00      0
5   Wien 2016-05-12 02:00:00      5
6   Wien 2016-05-12 02:30:00     10
7   Wien 2016-05-12 03:00:00     11
8   Wien 2016-05-12 03:30:00     22
9   Wien 2016-05-12 04:00:00      4
10  Wien 2016-05-12 04:30:00      2
# ... with 4,694 more rows

I would like to do rolling forecasts on this time series: estimate a model on the first n days' worth of data, then predict day n+1. This is easy with for-loops, but I thought I'd try doing it the tidy way. So I would like to create a data frame with an end date in the first column and, in the second, a nested data frame containing all the data from df up to that end date, which I can then iterate over using purrr::map() and friends. How do I create this nested data frame?

RoyalTS
  • Sorry, what is the question/problem? Could you provide a [reproducible example](http://stackoverflow.com/questions/5963269) and the expected output? – zx8754 Aug 19 '16 at 08:49
  • How do I create this nested data_frame? – RoyalTS Aug 19 '16 at 08:50
  • You could try using `tibble::tibble` to create the data frame. It makes it easy to create or add list-columns, which I think are the nested data frames you're looking for. – RobertMyles Mar 27 '17 at 20:46

1 Answer

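library(dplyr)
library(purrr)

# Sample data: the first ten rows shown in the question
df <- read.table(text="city            datetime orders
Wien '2016-05-12 00:00:00'      1
Wien '2016-05-12 00:30:00'      4
Wien '2016-05-12 01:00:00'      2
Wien '2016-05-12 01:30:00'      0
Wien '2016-05-12 02:00:00'      5
Wien '2016-05-12 02:30:00'     10
Wien '2016-05-12 03:00:00'     11
Wien '2016-05-12 03:30:00'     22
Wien '2016-05-12 04:00:00'      4
Wien '2016-05-12 04:30:00'      2",header=TRUE,stringsAsFactors=FALSE)

# One row per end date
df2 <- read.table(text="end
'2016-05-12 00:30:00'
'2016-05-12 01:30:00'
'2016-05-12 02:30:00'",header=TRUE,stringsAsFactors=FALSE)

# For each end date, keep the rows of df strictly before it and store them in a list-column.
# datetime and end are character here; the comparison works because the fixed-width
# ISO format sorts chronologically.
df2 <- df2 %>% mutate(df = map(end, ~ df %>% dplyr::filter(datetime < .x)))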
str(df2)
# 'data.frame': 3 obs. of  2 variables:
#  $ end: chr  "2016-05-12 00:30:00" "2016-05-12 01:30:00" "2016-05-12 02:30:00"
#  $ df :List of 3
#   ..$ :'data.frame':    1 obs. of  3 variables:
#   .. ..$ city    : chr "Wien"
#   .. ..$ datetime: chr "2016-05-12 00:00:00"
#   .. ..$ orders  : int 1
#   ..$ :'data.frame':    3 obs. of  3 variables:
#   .. ..$ city    : chr  "Wien" "Wien" "Wien"
#   .. ..$ datetime: chr  "2016-05-12 00:00:00" "2016-05-12 00:30:00" "2016-05-12 01:00:00"
#   .. ..$ orders  : int  1 4 2
#   ..$ :'data.frame':    5 obs. of  3 variables:
#   .. ..$ city    : chr  "Wien" "Wien" "Wien" "Wien" ...
#   .. ..$ datetime: chr  "2016-05-12 00:00:00" "2016-05-12 00:30:00" "2016-05-12 01:00:00" "2016-05-12 01:30:00" ...
#   .. ..$ orders  : int  1 4 2 0 5
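
To take this toward the rolling forecast described in the question, here is a rough sketch under a few assumptions: the datetime column is a real POSIXct rather than character, the cutoffs are midnights, and the "model" is just the mean of the training window, which is a placeholder and not a recommendation. The names orders_df, cutoffs, train, test, forecast and actual are made up for illustration, and the data is simulated.

```r
library(dplyr)
library(purrr)
library(tibble)

# Simulated stand-in for the question's data: three days of half-hourly order counts
set.seed(1)
orders_df <- tibble(
  city     = "Wien",
  datetime = seq(as.POSIXct("2016-05-12 00:00:00", tz = "UTC"),
                 by = "30 min", length.out = 3 * 48),
  orders   = rpois(3 * 48, lambda = 5)
)

# One cutoff per day to be predicted: midnight at the start of day 2 and day 3
cutoffs <- seq(as.POSIXct("2016-05-13 00:00:00", tz = "UTC"),
               by = "1 day", length.out = 2)

nested <- tibble(end = cutoffs) %>%
  mutate(
    # as.list() keeps each element a POSIXct, so the comparison stays date-aware
    train = map(as.list(end), function(e) filter(orders_df, datetime < e)),
    # the day to predict: the 24 hours starting at the cutoff
    test  = map(as.list(end), function(e) {
      filter(orders_df, datetime >= e, datetime < e + 24 * 60 * 60)
    }),
    # placeholder "model": mean orders per half hour in the training window
    forecast = map_dbl(train, function(d) mean(d$orders)),
    actual   = map_dbl(test, function(d) mean(d$orders))
  )

nested
```

Each row then carries its own training window and hold-out day, so swapping the placeholder for a real model only means changing the function passed to the map call that builds forecast.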
moodymudskipper