2

I am trying to convert date-time strings in R using the anytime() function. The strings have the format 'day-month-year hour:minute:second'. It seems that anytime() does not work in some cases, as shown in the example below.

library(anytime)
x1<-"03-01-2019 01:00:00"
x2<-"23-01-2019 17:00:00"

anytime(x1)
[1] "2019-03-01 01:00:00 CET"
anytime(x2)
[1] NA

I am trying to figure out how to get rid of this problem. Thanks for your help :)

  • the problem is with the day component of the date, which is in a format that is not recognized by the default `anytime()` parser – Hack-R Feb 18 '23 at 22:39
  • I see. How can I change the format to one that `anytime()` recognises? I have a column in a CSV file in this format :( – Pratik Mullick Feb 18 '23 at 22:40
  • Note that your first date, which per your description is day-month-year, is parsed as month-day-year. That is the danger with non-ISO8601 formats: `anytime` tries to help by _heuristically_ picking day-month or month-day, and the choice of separator matters. But as documented, you can add a preferred format to the set of formats `anytime` will try. See the help page. – Dirk Eddelbuettel Feb 18 '23 at 22:46
  • Why not use the built-in `strptime("23-01-2019 17:00:00", '%d-%m-%Y %T')`? – jay.sf Feb 19 '23 at 07:13
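
The base R suggestion in the last comment can be sketched as follows. This is a minimal sketch and not part of the original thread: the data frame `df` and its column `timestamp` are hypothetical stand-ins for the CSV column, and `tz = "UTC"` is an assumption chosen so the result does not depend on the local time zone.

```r
# Hypothetical stand-in for the CSV column mentioned in the question.
df <- data.frame(
  timestamp = c("03-01-2019 01:00:00", "23-01-2019 17:00:00"),
  stringsAsFactors = FALSE
)

# Parse with an explicit day-month-year format (%T is shorthand for %H:%M:%S).
# tz = "UTC" is an assumption; use the zone the data was recorded in.
df$parsed <- as.POSIXct(df$timestamp, format = "%d-%m-%Y %H:%M:%S", tz = "UTC")

# Strings that do not match the format become NA, so check for failures.
stopifnot(!anyNA(df$parsed))
print(df)
```

Because the format is stated explicitly, no heuristic is involved and the day/month order cannot be swapped silently.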

2 Answers

3

You can use addFormats() to add a specific format to the stack of formats known to and used by anytime() (and/or anydate()):

> library(anytime)
> addFormats("%d-%m-%Y %H:%M:%S")
> anytime(c("03-01-2019 01:00:00", "23-01-2019 17:00:00"))
[1] "2019-01-03 01:00:00 CST" "2019-01-23 17:00:00 CST"
> 

As explained a few times before on this site, the xx-yy notation is ambiguous and interpreted differently in different parts of the world.

So anytime is guided by the separator in use: / is more common in North America, so we use "mm/dd/yyyy". On the other hand, a hyphen is more common in Europe, so the "dd-mm-yyyy" starts that way. You can use getFormats() to see the formats anytime() knows. In a fresh session:

> head(getFormats(), 12)   ## abbreviated for display here
 [1] "%Y-%m-%d %H:%M:%S%f" "%Y-%m-%e %H:%M:%S%f" "%Y-%m-%d %H%M%S%f"  
 [4] "%Y-%m-%e %H%M%S%f"   "%Y/%m/%d %H:%M:%S%f" "%Y/%m/%e %H:%M:%S%f"
 [7] "%Y%m%d %H%M%S%f"     "%Y%m%d %H:%M:%S%f"   "%m/%d/%Y %H:%M:%S%f"
[10] "%m/%e/%Y %H:%M:%S%f" "%m-%d-%Y %H:%M:%S%f" "%m-%e-%Y %H:%M:%S%f"
> 

Call getFormats() without head() to see all of them.
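
For the CSV column mentioned in the comments, the same idea might look like the sketch below. It is only an illustration: the data frame `df` and the column name `timestamp` are hypothetical, and the `%in%` guard is simply a way to avoid registering the same format repeatedly in one session.

```r
library(anytime)

fmt <- "%d-%m-%Y %H:%M:%S"

# Register the day-month-year format only if it is not already known.
if (!fmt %in% getFormats()) addFormats(fmt)

# Hypothetical stand-in for the column read from the CSV file.
df <- data.frame(
  timestamp = c("03-01-2019 01:00:00", "23-01-2019 17:00:00"),
  stringsAsFactors = FALSE
)

# anytime() is vectorised, so the whole column is converted in one call.
df$parsed <- anytime(df$timestamp)
print(df)
```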

Dirk Eddelbuettel
2

An alternative approach could be to use the parsedate package:

library(parsedate)
> parsedate::parse_date(x1)
[1] "2019-03-01 01:00:00 UTC"
> parsedate::parse_date(x2)
[1] "2019-01-23 17:00:00 UTC"
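
Whichever parser is used, a round-trip check can flag swapped day and month values such as the 2019-03-01 result above. The following is a base R sketch under stated assumptions: `round_trips` is a hypothetical helper name, `tz = "UTC"` is an arbitrary choice, and the month-first parse is produced on purpose to imitate a heuristic guess.

```r
x <- c("03-01-2019 01:00:00", "23-01-2019 17:00:00")
fmt <- "%d-%m-%Y %H:%M:%S"

# Hypothetical helper: TRUE where writing the parsed value back out
# in the intended format reproduces the original string.
round_trips <- function(parsed, original, fmt) {
  !is.na(parsed) & format(parsed, fmt) == original
}

# Deliberately wrong month-first parse, imitating a heuristic guess.
wrong <- as.POSIXct(x, format = "%m-%d-%Y %H:%M:%S", tz = "UTC")
# Intended day-first parse.
right <- as.POSIXct(x, format = fmt, tz = "UTC")

print(round_trips(wrong, x, fmt))
print(round_trips(right, x, fmt))
```

The month-first parse fails the check for both strings (the first is read as 1 March, the second becomes NA), while the day-first parse passes for both.
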
TarJae