Using unnest
on a list of dates was an issue in previous versions of tidyr
. I got this same error and found a workaround, but then no longer needed the workaround once I updated to tidyr
0.8.1. It's an issue that's documented in a few issues on GitHub—#407 and #450 were ones I looked at.
If you have a version that can't unnest dates, you can build on @hpesoj626's answer by converting the dates to strings, unnesting, then converting the strings back to dates.
library(tidyverse)
wide <- tibble(
id = c(1, 1, 2, 2, 2),
HomeNum = c(0,1,0,1,2),
StartDate = as.Date(c("2001-01-01", "2001-03-01", "2000-04-01", "2001-02-01", "2002-08-01")),
FinishDate = as.Date(c("2001-02-01", "2002-05-01", "2001-01-01", "2002-07-01", "2002-12-01"))
)
# with previous versions of tidyr
wide %>%
group_by(id, HomeNum) %>%
mutate(date = list(seq.Date(StartDate, FinishDate, by = "month") %>% as.character())) %>%
tidyr::unnest() %>%
mutate(date = as.Date(date))
#> # A tibble: 50 x 5
#> # Groups: id, HomeNum [5]
#> id HomeNum StartDate FinishDate date
#> <dbl> <dbl> <date> <date> <date>
#> 1 1 0 2001-01-01 2001-02-01 2001-01-01
#> 2 1 0 2001-01-01 2001-02-01 2001-02-01
#> 3 1 1 2001-03-01 2002-05-01 2001-03-01
#> 4 1 1 2001-03-01 2002-05-01 2001-04-01
#> 5 1 1 2001-03-01 2002-05-01 2001-05-01
#> 6 1 1 2001-03-01 2002-05-01 2001-06-01
#> 7 1 1 2001-03-01 2002-05-01 2001-07-01
#> 8 1 1 2001-03-01 2002-05-01 2001-08-01
#> 9 1 1 2001-03-01 2002-05-01 2001-09-01
#> 10 1 1 2001-03-01 2002-05-01 2001-10-01
#> # ... with 40 more rows
Otherwise, a solution like the one they posted should work:
# with tidyr 0.8.1
wide %>%
group_by(id, HomeNum) %>%
mutate(date = list(seq.Date(StartDate, FinishDate, by = "month"))) %>%
tidyr::unnest()
#> # A tibble: 50 x 5
#> # Groups: id, HomeNum [5]
#> id HomeNum StartDate FinishDate date
#> <dbl> <dbl> <date> <date> <date>
#> 1 1 0 2001-01-01 2001-02-01 2001-01-01
#> 2 1 0 2001-01-01 2001-02-01 2001-02-01
#> 3 1 1 2001-03-01 2002-05-01 2001-03-01
#> 4 1 1 2001-03-01 2002-05-01 2001-04-01
#> 5 1 1 2001-03-01 2002-05-01 2001-05-01
#> 6 1 1 2001-03-01 2002-05-01 2001-06-01
#> 7 1 1 2001-03-01 2002-05-01 2001-07-01
#> 8 1 1 2001-03-01 2002-05-01 2001-08-01
#> 9 1 1 2001-03-01 2002-05-01 2001-09-01
#> 10 1 1 2001-03-01 2002-05-01 2001-10-01
#> # ... with 40 more rows
Another option is to gather
the data into a long format, where observations have a type
column showing whether it's the start or finish date. Then use complete
to fill in missing dates between each group's minimum and maximum dates. Gathering keeps the type
column, which gets filled in as NA
for the dates that are added. You could then drop the type
column if it's no longer useful.
wide %>%
gather(key = type, value = date, StartDate, FinishDate) %>%
group_by(id, HomeNum) %>%
complete(date = seq.Date(min(date), max(date), by = "month"))
#> # A tibble: 50 x 4
#> # Groups: id, HomeNum [5]
#> id HomeNum date type
#> <dbl> <dbl> <date> <chr>
#> 1 1 0 2001-01-01 StartDate
#> 2 1 0 2001-02-01 FinishDate
#> 3 1 1 2001-03-01 StartDate
#> 4 1 1 2001-04-01 <NA>
#> 5 1 1 2001-05-01 <NA>
#> 6 1 1 2001-06-01 <NA>
#> 7 1 1 2001-07-01 <NA>
#> 8 1 1 2001-08-01 <NA>
#> 9 1 1 2001-09-01 <NA>
#> 10 1 1 2001-10-01 <NA>
#> # ... with 40 more rows
Created on 2018-05-22 by the reprex package (v0.2.0).