1

I could not find an answer to the following question via the search function:

Why does the given ifelse() condition not work the way I intend?

I have a dataset in which a date was wrongly collected through an open-text field, so people filled it in using a variety of formats. By now I am close to something usable, but my intended next step, turning every mm/yy entry from before the year 2000 into an mm/19yy entry via the ifelse function, does not give me a correct result:

Dates <- c("10/19", "04/2019", "O5/1992", "03/92")

ifelse(str_length(Dates)==5 & str_sub(Dates,4,5)>20, stri_sub(Dates, 4, 3) <- 19, Dates)

The result looks like this:

[1] "10/1919"   "04/192019" "O5/191992" "19"   

While I would want it to look like this:

[1] "10/19"   "04/2019" "O5/1992" "03/1992"

Any help is highly appreciated!

Konrad Rudolph

2 Answers

3

This does not give the expected output you have shown, but I think it is better to turn the entries into standard dates so that they are easier to work with.

Dates <- c("10/19", "04/2019", "O5/1992", "03/92")
new_Date <- as.Date(lubridate::parse_date_time(paste0('1/', Dates), c('dmY', 'dmy')))
new_Date
#[1] "2019-10-01" "2019-04-01" "1992-05-01" "1992-03-01"

Then you can format these dates the way you want:

format(new_Date, '%Y-%m')
#[1] "2019-10" "2019-04" "1992-05" "1992-03"
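If the month/year layout from the question is still wanted, the same approach should cover it by changing the format string. A minimal base R sketch, assuming the parsed dates shown above (the vector is typed in directly so the snippet runs without lubridate, and `month_year` is just an illustrative name):

```r
# Dates as produced by the parsing step above, entered by hand here
new_Date <- as.Date(c("2019-10-01", "2019-04-01", "1992-05-01", "1992-03-01"))

# %m is the zero-padded month, %Y the four-digit year
month_year <- format(new_Date, '%m/%Y')
month_year
#[1] "10/2019" "04/2019" "05/1992" "03/1992"
```

Unlike the output asked for in the question, every entry ends up with a four-digit year, which keeps the column uniform.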
Ronak Shah
2

Rather than do this in a single expression, I recommend splitting it apart for readability:

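library(stringr)
library(purrr)

dates = c("10/19", "04/2019", "O5/1992", "03/92")
parts = str_split(dates, '/')
year = as.integer(map_chr(parts, `[[`, 2L))
# as.integer() warns and returns NA for the malformed month "O5"
months = as.integer(map_chr(parts, `[[`, 1L))
result = ifelse(
    str_length(dates) == 5L & year > 20 & year < 100,
    # sprintf() keeps the leading zero of the month, e.g. "03/1992"
    sprintf('%02d/19%02d', months, year),
    dates
)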

This code also handles data type conversions explicitly, which makes it more expressive and helps find errors. For instance, your third date accidentally uses O (capital o) instead of 0, which I only noticed because my code complains about the invalid conversion.
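To illustrate why the explicit conversions matter, here is a small base R sketch of two things that appear to go wrong in the original one-liner: a string is compared with a number, and an assignment is used as the `yes` argument of ifelse(). The vector `x` and the result `res` are made-up names for illustration, and the second part uses a plain numeric vector rather than `stri_sub<-`, on the assumption that the same mechanism applies there.

```r
# 1. Comparing a string with a number converts the number to a string,
#    so the comparison is done character by character, not numerically.
"92" > 20                 # TRUE, compared as "92" > "20"
"100" > 20                # FALSE, because "1" sorts before "2"
as.integer("100") > 20    # TRUE once the conversion is explicit

# 2. An assignment passed as `yes` is evaluated once for the whole vector.
#    It changes the variable as a side effect, and its value is only the
#    right-hand side of the assignment.
x <- c(1, 2, 3)
res <- ifelse(x > 2, x[1] <- 100, x)
x      # 100 2 3: x was modified before the `no` argument was read
res    # 100 2 100: TRUE positions get 100, the rest get the modified x
```

This matches the pattern in the question's output: the one entry that passed the test became just "19", while all the others had "19" inserted into them.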

Fundamentally I also agree with Ronak’s answer: the output you seem to want is inconsistent and should generally be avoided in favour of a uniform format, which incidentally leads to much simpler code, as Ronak’s answer shows.

Konrad Rudolph