I have a source column in a data frame where dates may be either in "dd.mm.yyyy" format or stored as a 5-digit Excel serial number. I would therefore like to use ifelse with str_detect to check what each element looks like and apply the appropriate conversion to each.

df$date <- ifelse(str_detect(df$date, "[0-9]{2}.[0-9]{2}.[0-9]{4}") == TRUE, 
                      as.Date(df$date, format = "%d.%m.%Y"),
                      as.Date(as.numeric(df$date), origin = "1899-12-30"))

Both conversions work as intended on their own, but when I put them into the ifelse statement I get weird results: 1 Jan 2019 becomes "17897". Can somebody explain why this happens and how I can make it work? Thanks

Edit: code snippet

  library(stringr)  # provides str_detect

  df <- c("01.01.2019", "43867")
  df <- ifelse(str_detect(df, "[0-9]{2}.[0-9]{2}.[0-9]{4}") == TRUE,
                      as.Date(df, format = "%d.%m.%Y"),
                      as.Date(as.numeric(df), origin = "1899-12-30"))

Desired output: "2019-01-01" "2020-02-06". Actual output: 17897 18298. If I apply the first (yes) conversion without ifelse, I get "2019-01-01" NA, and the second (no) conversion on its own gives NA "2020-02-06".

user1039698
  • Interesting problem, it'd be good to have a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with a snippet of the data to test solutions against. – anddt Feb 13 '20 at 09:37
  • Could you give examples and the expected output? – Onyambu Feb 13 '20 at 09:37
  • Read the help page of `ifelse`, particularly the Warning. – Edward Feb 13 '20 at 09:40
  • We need an example. However, if the variables are stored as date objects, I do not understand why you can't coerce them to numeric and then work with the timestamps. If you use dput() you can show us more precisely what the problem is. – Dimitrios Zacharatos Feb 13 '20 at 09:45
  • added code snippet – user1039698 Feb 13 '20 at 09:52

2 Answers

You could convert the data to numeric; values that are not plain numbers become NA (with a warning that is safe to ignore). We can then use if_else to pick the right Date conversion for each element based on that NA check.

df <- c("01.01.2019", "43867")
df1 <- as.numeric(df)
dplyr::if_else(is.na(df1), as.Date(df, format = "%d.%m.%Y"),
                as.Date(df1, origin = "1899-12-30"))
#[1] "2019-01-01" "2020-02-06"
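
If you would rather not depend on dplyr, the same idea can be sketched in base R by filling a Date vector by index instead of going through ifelse. This is only an illustration: the names `x`, `is_dotted` and `out` are made up here, and the anchored, dot-escaped pattern is my assumption about what the "dd.mm.yyyy" strings look like.

```r
x <- c("01.01.2019", "43867")

# TRUE where the element looks like dd.mm.yyyy (dots escaped, pattern anchored)
is_dotted <- grepl("^[0-9]{2}\\.[0-9]{2}\\.[0-9]{4}$", x)

# Start from an all-NA Date vector so the Date class is kept throughout
out <- rep(as.Date(NA), length(x))

# Convert each group with its own rule and assign by logical index
out[is_dotted]  <- as.Date(x[is_dotted], format = "%d.%m.%Y")
out[!is_dotted] <- as.Date(as.numeric(x[!is_dotted]), origin = "1899-12-30")

out
#[1] "2019-01-01" "2020-02-06"
```

Because each conversion only ever sees the elements it applies to, no coercion warning is raised and the result never stops being a Date.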
Ronak Shah

Just change the class of the ifelse result back to Date, as described in the help page of ifelse. No need to load other packages.

> class(df) <- "Date"
> df
[1] "2019-01-01" "2020-02-06"
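
To see why this works, here is a small base-R sketch of what happens to the values from the question. ifelse does not keep the Date class of its yes/no arguments, so what comes back are the underlying day counts since 1970-01-01. The names `x`, `is_dotted`, `res` and `fixed` are illustrative, and grepl stands in for str_detect so that no package is needed.

```r
x <- c("01.01.2019", "43867")
is_dotted <- grepl("^[0-9]{2}\\.[0-9]{2}\\.[0-9]{4}$", x)

# as.numeric() warns about "01.01.2019"; that NA is never selected, so silence it
res <- suppressWarnings(
  ifelse(is_dotted,
         as.Date(x, format = "%d.%m.%Y"),
         as.Date(as.numeric(x), origin = "1899-12-30"))
)

class(res)
#[1] "numeric"
res
#[1] 17897 18298

# The numbers are days since 1970-01-01, so they convert straight back
fixed <- as.Date(res, origin = "1970-01-01")
fixed
#[1] "2019-01-01" "2020-02-06"
```

Setting `class(res) <- "Date"` has the same effect as the last step, since a Date is stored as that day count plus a class attribute.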
Edward