4

I have a variable with dates in two different formats ("%Y-%m-%d" and "%m/%d/%Y"):

dput(df)
structure(1:8, .Label = c("2019-04-07", "2019-04-08", "2019-04-09", 
"2019-04-10", "7/29/2019", "7/30/2019", "7/31/2019", "8/1/2019"
), class = "factor")

# [1] 2019-04-07 2019-04-08 2019-04-09 2019-04-10 7/29/2019  7/30/2019  7/31/2019  8/1/2019  
# 8 Levels: 2019-04-07 2019-04-08 2019-04-09 2019-04-10 7/29/2019 7/30/2019 ... 8/1/2019

I try to parse the dates using as.Date with tryFormats:

df <- as.character(df)
d <- as.Date(df, tryFormats = c("%Y-%m-%d", "%m/%d/%Y"))

This converts the dates in the first format, but returns NA for the dates in the second format. If I run the two formats separately, each one parses its own dates correctly:

t1 <- as.Date(df, format = "%Y-%m-%d")
t2 <- as.Date(df, format = "%m/%d/%Y")

t1
# [1] "2019-04-07" "2019-04-08" "2019-04-09" "2019-04-10" NA          
# [6] NA           NA           NA          

t2
# [1] NA           NA           NA           NA           "2019-07-29"
# [6] "2019-07-30" "2019-07-31" "2019-08-01"

Any suggestions? I've looked through other responses, but haven't found any good tryFormats examples/questions that seem to address this.

Henrik
NoobR

3 Answers

5

We can use anydate() from the anytime package:

library(anytime)
anydate(df)

If one of your formats is not among the defaults, register it with addFormats() first and then call anydate().

Or with lubridate:

library(lubridate)
as.Date(parse_date_time(df, c("ymd", "mdy")))
akrun
  • Well, holy moly. I had no idea that "anytime" package existed. Worked like a charm. Thanks! Any info on the tryFormats situations above? – NoobR Dec 09 '19 at 18:15
5

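tryFormats does not pick a format per element: as.Date() tests the formats against the first non-NA element only, then applies the first one that parses to the whole vector. In your case you can convert the two formats individually, as you have already done, and fill in the gaps: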

d <- as.Date(df, format = "%Y-%m-%d")
d[is.na(d)] <- as.Date(df[is.na(d)], format = "%m/%d/%Y")
d
#[1] "2019-04-07" "2019-04-08" "2019-04-09" "2019-04-10" "2019-07-29"
#[6] "2019-07-30" "2019-07-31" "2019-08-01"
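
To see why tryFormats behaves this way, here is a minimal sketch in base R. The two-element vector `x` is made up for illustration; the point is only that the format is chosen from the first non-NA element and then reused for every element.

```r
# Illustrative input (not the asker's data): one date in each format.
x <- c("2019-04-07", "7/29/2019")
fmts <- c("%Y-%m-%d", "%m/%d/%Y")

# The first element matches "%Y-%m-%d", so that format is applied to
# the whole vector and the second element becomes NA.
d_first <- as.Date(x, tryFormats = fmts)
d_first

# Reversing the input changes which format is selected: now
# "%m/%d/%Y" is the first format that parses the first element,
# so the ISO-style date becomes NA instead.
d_rev <- as.Date(rev(x), tryFormats = fmts)
d_rev
```

So the order of the data, not just the order of the formats, decides which elements end up as NA.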
GKi
  • I've changed this to the correct answer as it more directly addresses my question. But all good answers. Thank you everyone. – NoobR Dec 09 '19 at 20:41
2

For a base R solution, you may try the following, as explained in this answer:

df
# [1] "2019-04-07" "2019-04-08" "2019-04-09" "2019-04-10" "7/29/2019"  "7/30/2019"
# [7] "7/31/2019"  "8/1/2019"

fmts <- c("%Y-%m-%d","%m/%d/%Y")

as.Date(apply(outer(df, fmts, as.Date),1,na.omit),'1970-01-01')
#[1] "2019-04-07" "2019-04-08" "2019-04-09" "2019-04-10" "2019-07-29" "2019-07-30" "2019-07-31" "2019-08-01"
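
If you have more than two formats, the same idea can be written as a loop. The following is only a sketch: `parse_mixed_dates` is a hypothetical helper name, not a function from base R or any package, and it assumes that the first format that parses an element is the one you want.

```r
# Hypothetical helper: try each format in turn and fill only the
# elements that are still NA after the previous formats.
parse_mixed_dates <- function(x, formats) {
  x <- as.character(x)                 # also handles factor input
  out <- rep(as.Date(NA), length(x))   # Date vector of NAs
  for (fmt in formats) {
    todo <- is.na(out) & !is.na(x)     # elements not parsed yet
    if (!any(todo)) break
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

# Illustrative subset of the data from the question.
dates <- c("2019-04-07", "2019-04-10", "7/29/2019", "8/1/2019")
res <- parse_mixed_dates(dates, c("%Y-%m-%d", "%m/%d/%Y"))
res
# [1] "2019-04-07" "2019-04-10" "2019-07-29" "2019-08-01"
```

Because earlier formats win, put the stricter or more common format first if a string could plausibly match more than one.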
S K