I have a data frame of date ranges, and I'd like to create one new row for each year covered by each range (including the starting and ending years). It looks like this:
id start end
1 2000 2004
2 2005 2005
3 2005 2007
4 2001 2002
Here 'id' is a factor, and 'start' and 'end' are dates.
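For reproducibility, the sample above could be built roughly like this. It is only a minimal sketch, assuming both columns are of class Date pinned to 1 January of each year; the real data may store them differently.

```r
# Hypothetical reconstruction of the sample data shown above.
# Assumption: start and end are Date values set to 1 January of the year.
df <- data.frame(
  id    = factor(1:4),
  start = as.Date(paste0(c(2000, 2005, 2005, 2001), "-01-01")),
  end   = as.Date(paste0(c(2004, 2005, 2007, 2002), "-01-01"))
)

# Inspect the column classes
str(df)
```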
But I need to expand the dataframe to look like this:
id year
1 2000
1 2001
1 2002
1 2003
1 2004
2 2005
3 2005
3 2006
3 2007
4 2001
4 2002
I've tried the approaches suggested in "Expand rows by date range using start and end date" and in "Generate rows between two dates in a dataframe". Specifically, I ran:
library(data.table)
setDT(df)[, .(year = seq.Date(start, end, by = '1 year')), by = 'id']
And also tried the dplyr approach:
library(dplyr)
library(purrr)
df_expanded <- df %>%
  transmute(id, year = map2(start, end, seq, by = "year")) %>%
  unnest %>%
  distinct
Both attempts resulted in a similar error:
Error in seq.int(r1$year, to0$year, by) : wrong sign in 'by' argument
I can't figure out why I am getting this error. It also occurs with the full dates in YYYY-MM-DD format. I'm not interested in monthly or daily differences, so I reformatted the dates to YYYY only, but the code still returns the same error message.
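For context, the message can be reproduced in isolation. The following is a minimal sketch using two made-up dates and a hypothetical data frame `chk`, neither taken from my real data. `seq()` on Dates with `by = "year"` appears to raise this error when the end date falls before the start date, so listing such rows (and rows with missing dates) may be a useful first check.

```r
# Made-up dates, not from the real data: 'to' is earlier than 'from'.
msg <- tryCatch(
  seq(as.Date("2005-01-01"), as.Date("2004-01-01"), by = "year"),
  error = function(e) conditionMessage(e)
)
print(msg)

# Hypothetical frame with one reversed range (id 2).
chk <- data.frame(
  id    = factor(1:2),
  start = as.Date(c("2000-01-01", "2005-01-01")),
  end   = as.Date(c("2004-01-01", "2003-01-01"))
)

# Rows whose end precedes their start; rows with NA dates are kept as well.
bad <- chk[is.na(chk$start) | is.na(chk$end) | chk$end < chk$start, ]
print(bad)
```

Running the same filter on the real `df` would show whether any ranges are reversed or incomplete.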
Can anyone please help?