In my R data set there is a data$date variable that mixes two formats: some values are dd-mmm-yy (e.g. "14-nov-17") and others are ddMMMyyyy (e.g. "14APR2016").

I'm stuck here. How can I convert all of them to Date format?

Thank you

2 Answers

One option is parse_date_time from lubridate, which can try several date formats in turn:

library(lubridate)
parse_date_time(v1, c("%d-%b-%y", "%d%b%Y"))
#[1] "2017-11-14 UTC" "2016-04-14 UTC"
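
Since the question is about a data$date column, here is a minimal sketch of applying the same call to a data frame and converting the result to class Date. The data frame below is a hypothetical stand-in for the asker's data, not their actual data set.

```r
library(lubridate)

# Hypothetical stand-in for the asker's `data` (assumption, not the real data)
data <- data.frame(date = c("14-nov-17", "14APR2016"),
                   stringsAsFactors = FALSE)

# parse_date_time() returns POSIXct (UTC by default);
# as.Date() then keeps only the calendar date
data$date <- as.Date(parse_date_time(data$date,
                                     orders = c("%d-%b-%y", "%d%b%Y")))

# Values matching none of the formats come back as NA, so count them
sum(is.na(data$date))
```

Checking the NA count afterwards is a cheap way to spot a third format hiding in the column.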

Or use anydate from anytime. Before applying anydate, check whether the formats you need are already registered with

library(anytime)
getFormats()

If some formats are missing, add them with addFormats

addFormats("%d-%b-%y")

and then apply anydate to the column/vector of dates

anydate(v1)
#[1] "2017-11-14" "2016-04-14"

data

v1 <- c("14-nov-17", "14APR2016")
akrun
  • Thank you so much, it worked perfectly! Amazing package. – CactusTown Jan 03 '19 at 08:55
  • @CactusTown Thank you for the comments. You can also check [here](https://stackoverflow.com/questions/53927770/how-to-programatically-isolate-one-line-out-of-a-series-of-lines-in-ggplot/53928384#53928384) – akrun Jan 03 '19 at 08:56
Another option, if you want to use base R and like regular expressions:

data$date <- as.Date(sub('(\\d{2})(\\w{3})(\\d{2})(\\d{2})', '\\1-\\2-\\4', data$date),
                     format = "%d-%b-%y")
C. Braun
  • No, just say no. Never ever use regexps for data parsing when suitable libraries exist. – Dirk Eddelbuettel Jan 02 '19 at 16:49
  • @DirkEddelbuettel I agree that 99% of the time (this case included) it is better to use a suitable library. Sometimes, though, I avoid adding a library for just one calculation when the code is shared with people on different platforms, since package management and installation create more dependencies and more chances for things to go wrong. Would you argue that it would still be better for the sake of cleaner code? – C. Braun Jan 02 '19 at 17:07
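
For the dependency-free case raised in the comments, a possible base R sketch that avoids regular expressions is to try each format in turn with as.Date and keep the first one that parses. The helper name parse_mixed_dates is hypothetical, and the sketch assumes an English (or C) locale, because %b matches month abbreviations in the current locale.

```r
# Hypothetical helper: try several strptime formats in order and keep,
# for each element, the first format that yields a non-NA Date.
parse_mixed_dates <- function(x, formats = c("%d-%b-%y", "%d%b%Y")) {
  x <- as.character(x)                  # also handles factor columns
  out <- rep(as.Date(NA), length(x))    # start with all-NA Dates
  for (fmt in formats) {
    todo <- is.na(out)                  # only elements not parsed yet
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

v1 <- c("14-nov-17", "14APR2016")
parse_mixed_dates(v1)
```

The order of the formats matters in general: a pattern such as "%d%b%y" would accept "14APR2016" but read the year as 20, so more specific formats should come before looser ones. Elements that match no format stay NA.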