0

I am using sqldf library to manipulate data frame in R. Currently, I have a data frame like this:

ID       Start_Date       End_Date
1          08-29           09-01

I want to create a new data frame using sqldf to create a range of dates between the Start_Date and the End_Date, for example, for ID1, I want the final data frame look like:

ID        Date_Range      
1           08-29   
1           08-30   
1           08-31
1           09-01

I think I can just create a new data frame. But I am wondering if it is possible to implement in sqldf?

Squirrel K
  • 25
  • 3
  • I think this is the post you are looking for - https://stackoverflow.com/questions/11494511/expand-ranges-defined-by-from-and-to-columns but it does not have `sqldf` answer. – Ronak Shah Sep 04 '20 at 01:19
  • @RonakShah It can't work with POSIXt type and report error "Error in seq.POSIXt(Start_Day, End_Day) : exactly two of 'to', 'by' and 'length.out' / 'along.with' must be specified" – Squirrel K Sep 04 '20 at 01:38
  • I see you updated your data so now you have `Start_Date`, `End_date` as characters instead of `POSIXct` ? Also are you open to other solutions or looking for only `sqldf` answer? – Ronak Shah Sep 04 '20 at 01:41
  • @RonakShah Start_Date and End_Date are POSIXct type. I tried using the method in the link you send me. But it seems to be not working. – Squirrel K Sep 04 '20 at 01:53
  • You cannot have only month and date information in `POSIXct` type. Maybe sharing the data using `dput` might help to clarify what you actually have. I have added an answer based on what you have shown. – Ronak Shah Sep 04 '20 at 01:58
  • See https://stackoverflow.com/questions/32982372/how-to-generate-all-dates-between-two-dates – G. Grothendieck Sep 04 '20 at 11:18

1 Answers1

0

Here is one way to expand the date ranges using tidyverse functions.

library(dplyr)

df %>%
  mutate(across(ends_with('Date'), as.Date, '%m-%d'), 
         #You don't need the above if columns are already of type date/POSIXct
         Date_Range = purrr::map2(Start_Date, End_Date, seq, by = '1 day')) %>%
  tidyr::unnest(Date_Range) %>%
  mutate(Date_Range = format(Date_Range, '%m-%d')) %>%
  select(-Start_Date, -End_Date)

#     ID Date_Range
#  <int> <chr>     
#1     1 08-29     
#2     1 08-30     
#3     1 08-31     
#4     1 09-01     

data

df <- structure(list(ID = 1L, Start_Date = "08-29", End_Date = "09-01"), 
      class = "data.frame", row.names = c(NA, -1L))
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213