1

***NOTE: please do not link a similar post. I have found several other similar postings, but their responses have not resolved the errors I get: "Error in seq.int(0, to0 - from, by) : 'to' must be a finite number" or "from must be of length 1".... I'm looking to understand why these error statements occur and how to prevent them from occuring... Thanks!

I have a data frame like the following

    id  startdate  enddate
    1   01/01/2011 01/05/2011
    1   02/03/2012 02/05/2012
    2   03/04/2013 03/06/2013
    3   04/06/2014 04/09/2014

I want to transform the data frame so as to create the following:

    id   date
    1    01/01/2011
    1    01/02/2011
    1    01/03/2011
    1    01/04/2011
    1    01/05/2011
    1    02/03/2012
    1    02/04/2012
    1    02/05/2012
    2    03/04/2013
    2    03/05/2013
    2    03/06/2013 

.... and so on to fill in the sequence of dates between startdate and enddate

I have tried the following....

one<-as.data.table(one)
one[, startdate:=as.character(startdate)]
one[, enddate:=as.character(enddate)]
one[, startdate:=as.Date(startdate, format="%m/%d/%Y")]
one[, enddate:=as.Date(enddate, format="%m/%d/%Y")]
one<-as.data.frame(one)
one%>%
  rowwise() %>%
  do(data.frame(id=.$id, date=seq(.$startdate,.$enddate,by="day")))

When I run this, I get the following error: Error in seq.int(0, to0 - from, by) : 'to' must be a finite number

Why is this? And how can I fix this piece of code?

bziggy
  • 463
  • 5
  • 19
  • Your example doesn't show such a condition – akrun Feb 10 '20 at 23:54
  • Can you also check whether the dates format is same or if there are multiple formats. i.e. `seq(NA, Sys.Date(), by = '1 day')# Error in seq.default(NA, Sys.Date(), by = "1 day") : 'from' must be a finite number` – akrun Feb 10 '20 at 23:57
  • Can you check whether the `anydate` update shows the same error – akrun Feb 11 '20 at 00:03
  • @akrun I just tried the anydate update and received the same error statement...thanks for all of your help and advice!! – bziggy Feb 11 '20 at 00:08
  • If you can share a link to the data or a small example that shows the error, would dbe helpful – akrun Feb 11 '20 at 00:08
  • @akrun I ended up removing the observations where the start date and the end date were the same and your original solutions worked. Thanks!! – bziggy Feb 11 '20 at 00:12
  • Did you try this? https://stackoverflow.com/questions/11494511/expand-ranges-defined-by-from-and-to-columns Isn't it similar? Does it give you what you want after you change the dates to standard format? – Ronak Shah Feb 11 '20 at 02:49

1 Answers1

1

With data.table, we can use Map. Convert the 'startdate' 'enddate' to Date class, use Map to get the sequence of corresponding elements, replicate the 'id' based on the lengths of the list output of dates, concatenate the list of dates to create the two column output

library(data.table)
one[, {lst1 <- Map(seq, as.IDate(startdate, "%m/%d/%Y"),
                       as.IDate(enddate, "%m/%d/%Y"),
                      MoreArgs = list(by = "day"))
         .(id = rep(id, lengths(lst1)), date = do.call(c, lst1))}]
#    id       date
# 1:  1 2011-01-01
# 2:  1 2011-01-02
# 3:  1 2011-01-03
# 4:  1 2011-01-04
# 5:  1 2011-01-05
# 6:  1 2012-02-03
# 7:  1 2012-02-04
# 8:  1 2012-02-05
# 9:  2 2013-03-04
#10:  2 2013-03-05
#11:  2 2013-03-06
#12:  3 2014-04-06
#13:  3 2014-04-07
#14:  3 2014-04-08
#15:  3 2014-04-09

If there are multiple formats in 'date' columns, one option is anydate from anytime to automatically convert some of the formats to Date class

library(anytime)
one[, {lst1 <- Map(seq, anydate(startdate),
                    anydate(enddate),
                   MoreArgs = list(by = "day"))
      .(id = rep(id, lengths(lst1)), date = do.call(c, lst1))}]

Or using tidyverse

library(dplyr)
library(purrr)
library(tidyr)
library(lubridate)
one %>% 
   transmute(id, date = map2(mdy(startdate), mdy(enddate), seq, by = 'day')) %>% 
   unnest(c(date))
# A tibble: 15 x 2
#     id date      
#   <int> <date>    
# 1     1 2011-01-01
# 2     1 2011-01-02
# 3     1 2011-01-03
# 4     1 2011-01-04
# 5     1 2011-01-05
# 6     1 2012-02-03
# 7     1 2012-02-04
# 8     1 2012-02-05
# 9     2 2013-03-04
#10     2 2013-03-05
#11     2 2013-03-06
#12     3 2014-04-06
#13     3 2014-04-07
#14     3 2014-04-08
#15     3 2014-04-09

data

one <- structure(list(id = c(1L, 1L, 2L, 3L), startdate = c("01/01/2011", 
"02/03/2012", "03/04/2013", "04/06/2014"), enddate = c("01/05/2011", 
"02/05/2012", "03/06/2013", "04/09/2014")), class = "data.frame", row.names = c(NA, 
-4L))

setDT(one)
akrun
  • 874,273
  • 37
  • 540
  • 662
  • I'm still getting the same error, of needing from to be a finite number – bziggy Feb 10 '20 at 23:50
  • @boodaloo1 I showed a reprodducible example with the `data`. Couldn't reproduce the issue – akrun Feb 10 '20 at 23:50
  • @boodaloo1 May be you have some 'enddate' less than the `startdate` in the original dataset – akrun Feb 10 '20 at 23:52
  • I only posted a few lines of my frame as it has a few thousand observations to it. I will check to see if the enddate is less than the startdate in any of the scenarios...thanks for the help! – bziggy Feb 10 '20 at 23:57
  • @boodaloo1 that error wouldd be a different one. I am thinking that the sspecific error is because some of the elements are iin a. different format, so when `as.Date` is applied, it returns NA and this created the `finite` error – akrun Feb 10 '20 at 23:58
  • they seem to be in the same format.... I did just find a few instances where startdate and enddate are the same. Could this create an issue? – bziggy Feb 11 '20 at 00:04
  • It wouldn't be an issue `seq(Sys.Date(), Sys.Date(), by = '1 day')# [1] "2020-02-10"` – akrun Feb 11 '20 at 00:05
  • @boodaloo1 Can you show a link to the data which shows the error – akrun Feb 11 '20 at 00:06