I have a data frame column of dates, stored as character strings, in a mixture of formats:

c("Apr11", "2005-01-01", "Apr13", "2009-12-01")

I tried to use the lubridate parse_date_time() function to parse these dates as follows:

parse_date_time(x = variable, orders = c('y-m-d', 'by'), locale='en_US.UTF-8')

While parse_date_time() is able to parse dates with the format 'y-m-d', it fails to parse dates with the format 'by'.

I then experimented with a toy example, but with no success:

library(lubridate)
z <- "Apr11"
parse_date_time(z, "by")

I keep getting the same result, NA with a warning:

[1] NA
Warning message:
All formats failed to parse. No formats found. 

I read the documentation and tried a number of different things with no luck:

  • I set locale to the default_locale() function.
  • I tried B instead of b.
  • I tried setting the exact argument to TRUE, as suggested by the note in the documentation.
  • I tried the parse_date_time2() function, which worked on the toy example but not on the original data frame with the mixture of dates. It threw a warning: "Multiple orders supplied. Only first order is used."

I realize that b is locale-sensitive, but I don't see the issue here. Here's the output from default_locale():

<locale>
Numbers:  123,456.78
Formats:  %AD / %AT
Timezone: UTC
Encoding: UTF-8
<date_names>
Days:   Sunday (Sun), Monday (Mon), Tuesday (Tue), Wednesday (Wed), Thursday (Thu), Friday (Fri), Saturday (Sat)
Months: January (Jan), February (Feb), March (Mar), April (Apr), May (May), June (Jun), July (Jul), August (Aug), September (Sep),
    October (Oct), November (Nov), December (Dec)
AM/PM:  AM/PM

And here's the output from Sys.getlocale():

[1] "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C"

I've also looked into various posts on Stack Overflow, but nothing helped.

Thanks in advance!

Cola4ever
  • I think the problem is you are not supplying a day - try for example `parse_date_time("1Apr11","dby")` – konvas Oct 18 '17 at 15:49
  • But parse_date_time2() works without supplying the day, so I'm not sure this would change anything. – Cola4ever Oct 18 '17 at 15:58
  • Good point. I am not sure why this is; the two parsers must be doing something slightly different under the hood. But the problem is parsing this individual date, not the vector. Try e.g. `parse_date_time(c("1Apr11", "2015-05-10"), c("dby", "Y-m-d"))`. I suppose I didn't understand your original question and thought it was about vectorizing it rather than parsing that particular string. – konvas Oct 18 '17 at 16:05
  • Thanks! I initially tried parse_date_time2(), which didn't work, but parse_date_time() works as suggested. I just need to prepend a 1 to the month-year values in the dataset. I wish I could avoid doing this. – Cola4ever Oct 18 '17 at 16:09
  • This could be a bug in `parse_date_time` actually. Could be worth checking with the developers by raising an issue at https://github.com/tidyverse/lubridate/issues – konvas Oct 18 '17 at 16:13
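
The workaround discussed in the comments above (supplying a day so that the 'dby' order can be used) could be sketched roughly as follows. This is only a sketch under assumptions: add_day() is a hypothetical helper, not part of lubridate, and the regular expression assumes the month-year values are always a three-letter English month abbreviation followed by a two-digit year. The day "01" is an arbitrary placeholder.

```r
library(lubridate)

# Hypothetical helper (not a lubridate function): prefix a placeholder day
# "01" to strings that look like "Apr11" so they can be parsed as 'dby'.
add_day <- function(x) {
  # Assumption: month-year values are exactly 3 letters + 2 digits
  is_month_year <- grepl("^[A-Za-z]{3}[0-9]{2}$", x)
  ifelse(is_month_year, paste0("01", x), x)
}

dates <- c("Apr11", "2005-01-01", "Apr13", "2009-12-01")

# "Apr11" becomes "01Apr11"; the ISO-style dates are left untouched
padded <- add_day(dates)

# Parse the mixed vector with both orders, as suggested in the comments
parsed <- parse_date_time(padded, orders = c("dby", "Ymd"))
```

This leaves the original column unchanged and only pads a temporary copy, which avoids editing the dataset itself.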

0 Answers