I'm looking for help working with dates in R. The code for a simple data frame is below, with one column of start dates and one column of end dates. I would like to create a new column containing the difference in days between each start date and its end date. The dates are also in different formats, so is there an easy way to convert them all to a common format? I've been reading about the lubridate package but, as an R newbie, haven't found anything on this particular situation that I can learn quickly. If possible, it would be great if the answer fit into a dplyr pipeline as well, so I can calculate the average number of days, etc.

Start.date <- c("05-May-15", "10-June-15", "July-12-2015")
End.date <- c("12-July-15", "2015-Aug-15", "Sept-12-2015")
Dates.df <- data.frame(Start.date, End.date)
Mike
  • Duplicate question: http://stackoverflow.com/questions/11666172/calculating-number-of-days-between-2-columns-of-dates-in-data-frame – InfiniteFlash Apr 17 '16 at 04:17
  • Does that link answer your question btw? – InfiniteFlash Apr 17 '16 at 04:18
  • @InfiniteFlashChess It may appear as duplicate but it is not exactly duplicate. In this question, the End.date vector may have different date formats. – Kunal Puri Apr 17 '16 at 04:18
  • And start date can also have different date formats. – Kunal Puri Apr 17 '16 at 04:19
  • For the "12-July-15", it can be interpreted as either 2015 or 2012. – akrun Apr 17 '16 at 04:19
  • Oh okay, thank you for the clarification. – InfiniteFlash Apr 17 '16 at 04:20
  • I would say `library(lubridate);Reduce("-",lapply(Dates.df, function(x) {x1 <-sub("(.*)(\\b[A-Za-z]{3}).(-.*)", "\\1\\2\\3", x);parse_date_time(x1,guess_formats(x1, c("dbY","dby", "bdY", "Ybd")))}))` would work in most cases except the one with ambiguous format, i.e. `12-July-15`. For those cases, it is better to manually change the format before attempting this. – akrun Apr 17 '16 at 04:27
  • The date 'style' is either European or American and likely not mixed, @akrun, so Mike should be able to use your suggested code without the manual changes. – Chris Apr 17 '16 at 04:36
  • @Chris The `12-July-15` is parsed as `"0012-07-15 UTC"` based on the code I showed – akrun Apr 17 '16 at 04:39
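
One way to sidestep the ambiguity raised in the comments is to state the assumptions explicitly and parse the strings by hand. The sketch below uses base R only rather than lubridate. `parse_mixed_date()` is a hypothetical helper written for this example, not part of any package. It assumes the fields are separated by `-`, that exactly one field is an English month name, that a four-digit number is always the year, and that a two-digit year means 20yy (so `12-July-15` is read as 12 July 2015).

```r
# Sample data from the question.
Start.date <- c("05-May-15", "10-June-15", "July-12-2015")
End.date   <- c("12-July-15", "2015-Aug-15", "Sept-12-2015")
Dates.df   <- data.frame(Start.date, End.date, stringsAsFactors = FALSE)

# Hypothetical helper (not from any package). Assumptions: "-" separators,
# one English month name per string, four-digit number = year, and a
# two-digit year means 20yy.
parse_mixed_date <- function(x) {
  one <- function(s) {
    parts <- strsplit(s, "-", fixed = TRUE)[[1]]
    is_month <- grepl("^[A-Za-z]+$", parts)
    if (length(parts) != 3 || sum(is_month) != 1) return(NA_character_)
    nums <- parts[!is_month]
    if (!all(grepl("^[0-9]+$", nums))) return(NA_character_)
    # Match on the first three letters so "June", "July" and "Sept" all work.
    month <- match(tolower(substr(parts[is_month], 1, 3)), tolower(month.abb))
    if (is.na(month)) return(NA_character_)
    # The year comes first only when the leading number has four digits;
    # otherwise the year is the last number (day-month-year or month-day-year).
    if (nchar(nums[1]) == 4) {
      year <- as.integer(nums[1]); day <- as.integer(nums[2])
    } else {
      year <- as.integer(nums[2]); day <- as.integer(nums[1])
    }
    if (year < 100) year <- year + 2000L
    sprintf("%04d-%02d-%02d", year, month, day)
  }
  iso <- vapply(as.character(x), one, character(1), USE.NAMES = FALSE)
  as.Date(iso, format = "%Y-%m-%d")
}

# Convert both columns to Date, then subtract to get the difference in days.
Dates.df$Start <- parse_mixed_date(Dates.df$Start.date)
Dates.df$End   <- parse_mixed_date(Dates.df$End.date)
Dates.df$Days  <- as.numeric(Dates.df$End - Dates.df$Start)

mean_days <- mean(Dates.df$Days)
print(Dates.df)
```

Under those assumptions the `Days` column comes out as 68, 66 and 62. If dplyr is loaded, the same helper should slot into a pipeline, for example `Dates.df %>% mutate(Days = as.numeric(parse_mixed_date(End.date) - parse_mixed_date(Start.date))) %>% summarise(Mean.days = mean(Days))`.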
