
I have a panel data set with a column of date entries stored as class "character". Some are formatted as mm/dd/yyyy and others as dd-mm-yyyy. I want to convert them into a Date vector so that I can subset the data by a cutoff date. However, as.Date does not work, since the format of the entries varies.

df$OPdate <- as.Date(OPdate, format = "%Y-%M-%D")
dfnew = subset(df,OPdate < "2021/3/29")
df_age14 = subset(dfnew, age > 13)
list14 = unique(df_age14$postID)
finaldf = subset(df, postID %in% list14)

This is the code I am trying to run once the dates are formatted correctly. Any suggestions? Thanks in advance

EmBe
  • It will help people answer if you can provide an example of data with the two formats as you have them. That reduces redundant work and reduces potential misunderstandings. e.g. could be as simple as `mydates <- c("2021/3/29", "11/21/2021")` – Jon Spring Jan 25 '22 at 16:28
  • Does this help: https://stackoverflow.com/a/70304571/3358272? Specifically bullet 2 with "candidate" formats. – r2evans Jan 25 '22 at 16:32

2 Answers


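If you're sure you only have the two formats, you can call as.Date with tryFormats inside sapply. The per-element call is needed because tryFormats is not vectorized: as.Date picks a single format based on the first non-NA element and applies it to the whole vector. strftime then returns each date as a character string in yyyy-mm-dd form.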
Using toy data.

dates
[1] "02/23/2021" "11/03/2021" "22-03-2021" "23-04-2020" "29-06-2021"

sapply(dates, function(x) 
  strftime(as.Date(x, tryFormats=c("%d-%m-%Y","%m/%d/%Y"))))
  02/23/2021   11/03/2021   22-03-2021   23-04-2020   29-06-2021 
"2021-02-23" "2021-11-03" "2021-03-22" "2020-04-23" "2021-06-29" 
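
To see why the per-element call matters, here is a small sketch on the same toy data (the vector is declared again so the snippet runs on its own; the names `fmts`, `whole` and `each` are only illustrative). Passing the whole vector at once lets as.Date settle on the format that fits the first entry, so the entries in the other format come back as NA.

```r
dates <- c("02/23/2021", "11/03/2021", "22-03-2021", "23-04-2020", "29-06-2021")
fmts  <- c("%d-%m-%Y", "%m/%d/%Y")

# Whole vector at once: the format is chosen from the first element only,
# so the dd-mm-yyyy entries cannot be parsed and become NA
whole <- as.Date(dates, tryFormats = fmts)
is.na(whole)

# One element at a time: each entry is matched against both formats
each <- vapply(dates,
               function(x) format(as.Date(x, tryFormats = fmts)),
               character(1))
anyNA(each)
```

vapply is used here instead of sapply only to make the return type explicit; the parsing logic is the same.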

Working with the data

dates_new <- sapply(dates, function(x) 
  strftime(as.Date(x, tryFormats=c("%d-%m-%Y","%m/%d/%Y"))))

dates_new > "2021-04-14"
02/23/2021 11/03/2021 22-03-2021 23-04-2020 29-06-2021 
     FALSE       TRUE      FALSE      FALSE       TRUE

# or
as.Date(dates_new) - 23
[1] "2021-01-31" "2021-10-11" "2021-02-27" "2020-03-31" "2021-06-06"
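
For the data frame in the question, a possible sketch looks like the following. Note that it swaps the sapply approach for a vectorized variant: each format is tried on the whole column and the successful parse is kept. The data frame below is a made-up stand-in; only the column names OPdate, age and postID come from the question.

```r
# Hypothetical stand-in for the asker's data (values invented for illustration)
df <- data.frame(
  postID = c(1, 1, 2, 3, 3),
  age    = c(15, 15, 12, 14, 14),
  OPdate = c("02/23/2021", "11/03/2021", "22-03-2021", "23-04-2020", "29-06-2021")
)

# Parse the column once per format; entries that do not match become NA
mdy_slash <- as.Date(df$OPdate, format = "%m/%d/%Y")
dmy_dash  <- as.Date(df$OPdate, format = "%d-%m-%Y")

# Keep whichever parse succeeded for each row
df$OPdate <- mdy_slash
df$OPdate[is.na(mdy_slash)] <- dmy_dash[is.na(mdy_slash)]

# The filtering steps from the question, now on a real Date column
dfnew   <- subset(df, OPdate < as.Date("2021-03-29"))
list14  <- unique(subset(dfnew, age > 13)$postID)
finaldf <- subset(df, postID %in% list14)
```

This relies on the assumption stated above that only the two formats occur; any entry in a third format would stay NA and should be checked with anyNA(df$OPdate).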
Andre Wildberg

You can use the "lubridate" package from the tidyverse to parse dates written in a known order, e.g. mdy("4/1/17") returns "2017-04-01" and dmy("14/10/2021") returns "2021-10-14".
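
A minimal sketch of how this could be applied to a column with both formats, assuming (as in the question) that the "/" entries are month-first and the "-" entries are day-first. The vector `dates` and the helper names `has_slash` and `parsed` are illustrative, not part of lubridate.

```r
library(lubridate)

dates <- c("02/23/2021", "11/03/2021", "22-03-2021", "23-04-2020", "29-06-2021")

# Assumption: the separator tells the two formats apart
has_slash <- grepl("/", dates, fixed = TRUE)

# Start from an all-NA Date vector and fill each group with its own parser
parsed <- as.Date(rep(NA, length(dates)))
parsed[has_slash]  <- mdy(dates[has_slash])
parsed[!has_slash] <- dmy(dates[!has_slash])

parsed
```

Splitting on the separator avoids guessing for ambiguous entries such as "11/03/2021", which would be valid in either order.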
