I am working with an input file that contains date-time strings in different formats (some year-month-day, some month/day/year). Example input:

input <- c("2014-08-31 23:59:38" , "9/1/2014 00:00:25","2014-08-31 13:39:23", "12/1/2014 20:03:28")

How can I convert all of these formats with a single function, and do it quickly? I am processing millions of lines.

So far I have written this function:

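library(lubridate)

# Try month/day/year first; fall back to year-month-day if that fails.
# Works on one string at a time, so it has to be called once per element.
convert_date <- function(x){
  if (is.na(mdy_hms(x))){
    return(ymd_hms(x))
  }
  return(mdy_hms(x))
}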

However, it is extremely slow. I am looking for a faster and more convenient method.

Thank you so much for your time.

  • Does this answer your question? [R Convert to date from multiple formats](https://stackoverflow.com/questions/43381221/r-convert-to-date-from-multiple-formats) – Andrea M Apr 20 '22 at 10:51

1 Answer

If you can construct a vector of the possible formats the date-times could be in, you could use clock. For each date-time string, it tries the formats in order and stops at the first one that succeeds.

Note that this only works if your formats are unambiguous. For example, it would probably give you faulty results if you had both %m/%d/%Y and %d/%m/%Y in the same vector, because a string such as 9/1/2014 matches both.
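To make the ambiguity concrete, here is a small base R sketch using one of the date strings from the question. The variable names are only for illustration; the point is that the same string parses without error under both formats but gives two different dates, so whichever format comes first would win silently.

```r
# The same string is valid under both formats, but means different days.
x <- "9/1/2014"

as_mdy <- as.Date(x, format = "%m/%d/%Y")  # read as month/day/year
as_dmy <- as.Date(x, format = "%d/%m/%Y")  # read as day/month/year

print(as_mdy)
print(as_dmy)

# Neither parse fails, so there is no NA to signal the wrong choice.
print(as_mdy == as_dmy)
```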

library(clock)

input <- c(
  "2014-08-31 23:59:38" , "9/1/2014 00:00:25",
  "2014-08-31 13:39:23", "12/1/2014 20:03:28"
)

format <- c("%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M:%S")

date_time_parse(input, zone = "UTC", format = format)
#> [1] "2014-08-31 23:59:38 UTC" "2014-09-01 00:00:25 UTC"
#> [3] "2014-08-31 13:39:23 UTC" "2014-12-01 20:03:28 UTC"
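If adding a package is not an option, the same "first format that succeeds" idea can be sketched in base R. This is only a rough illustration, not how clock implements it: parse_mixed is a hypothetical helper name, and it assumes the formats are unambiguous and that UTC is the right zone. Each format is applied in one vectorised pass to the elements that are still unparsed, rather than calling a function once per string.

```r
# Hypothetical helper (not part of clock or lubridate): try each format in
# order, but only on the elements that have not been parsed yet.
parse_mixed <- function(x, formats, tz = "UTC") {
  # Start with an all-NA date-time vector of the right length.
  out <- .POSIXct(rep(NA_real_, length(x)), tz = tz)
  for (fmt in formats) {
    todo <- is.na(out)
    if (!any(todo)) break
    # as.POSIXct() returns NA where the string does not match fmt,
    # so those elements are retried with the next format.
    out[todo] <- as.POSIXct(x[todo], format = fmt, tz = tz)
  }
  out
}

input <- c(
  "2014-08-31 23:59:38", "9/1/2014 00:00:25",
  "2014-08-31 13:39:23", "12/1/2014 20:03:28"
)
formats <- c("%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M:%S")

parsed <- parse_mixed(input, formats)
print(parsed)
```

Strings that match none of the formats stay NA, so it is worth checking anyNA(parsed) afterwards.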
Davis Vaughan