I'm trying to write a function that automatically determines the date format of a column in a data frame and applies the correct as.Date() call. The dates typically come as "%Y-%m-%d" or "%m/%d/%y" (which one depends on whether the .csv has been opened and saved in Excel).
Initially, I thought an "if/else" statement would work, and came up with the following:
if(nchar(df$Date[[1]] == 10)){
  df$Date <- as.Date(df$Date)
} else {
  df$Date <- as.Date(df$Date, format = "%m/%d/%y")
}
But it throws a "character string is not in a standard unambiguous format" error.
Here's a sample data frame to work with:
a <- 1:10
dates1 <- c("3/21/16", "3/22/16", "3/23/16", "3/24/16", "3/25/16", "3/26/16", "3/27/16", "3/28/16", "3/29/16", "3/30/16")
dates2 <- c("2016-03-21", "2016-03-22", "2016-03-23", "2016-03-24", "2016-03-25", "2016-03-26", "2016-03-27", "2016-03-28", "2016-03-29", "2016-03-30")
df <- data.frame(a, dates1, dates2)
df$dates1 <- as.character(df$dates1)
df$dates2 <- as.character(df$dates2)
The if/else statement should work on both "dates1" and "dates2", but as you can see, it only works on "dates2":
if(nchar(df$dates1[[1]] == 10)){
  df$dates1 <- as.Date(df$dates1)
} else {
  df$dates1 <- as.Date(df$dates1, format = "%m/%d/%y")
}
if(nchar(df$dates2[[1]] == 10)){
  df$dates2 <- as.Date(df$dates2)
} else {
  df$dates2 <- as.Date(df$dates2, format = "%m/%d/%y")
}
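One likely cause is the placement of the closing parenthesis in the condition. As written, nchar() receives the result of the comparison (`df$dates1[[1]] == 10`, which is FALSE) rather than the string itself, and nchar(FALSE) is 5, which if() treats as TRUE. The first branch therefore always runs, and as.Date() without a format fails on "3/21/16". Below is a minimal sketch of a corrected check, not a definitive implementation. The helper name `parse_date_col` is hypothetical, and it assumes every value in the column shares the format of the first one and that only the two formats mentioned above occur.

```r
# Hypothetical helper: chooses the format from the length of the first value.
# Assumption: the whole column uses the same format as its first element,
# and only "%Y-%m-%d" (10 characters) or "%m/%d/%y" (shorter) can occur.
parse_date_col <- function(x) {
  x <- as.character(x)
  # Note the parenthesis: compare the character count, not the string, to 10.
  if (nchar(x[[1]]) == 10) {
    as.Date(x, format = "%Y-%m-%d")
  } else {
    as.Date(x, format = "%m/%d/%y")
  }
}

dates1 <- c("3/21/16", "3/22/16")
dates2 <- c("2016-03-21", "2016-03-22")

parsed1 <- parse_date_col(dates1)
parsed2 <- parse_date_col(dates2)
```

With the sample data frame, this would be applied as `df$dates1 <- parse_date_col(df$dates1)` and likewise for `dates2`.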
Apologies in advance for any formatting issues.