
I am trying to extract a date from a string of text. My code works for dates in the "m/dd/yyyy" format (i.e. January to September), but not for the "mm/dd/yyyy" format (i.e. October to December); see the code below. The result appears as 02/27/2019 when it should be 12/27/2019. Does anyone know how I can get the code to work for all months?

string <- "20 French From home 12/27/2019"

gsub( ".*(\\d{1,2}/\\d{1,2}/\\d{4}).*", "\\1", string)

  • Use `gsub( ".*?(\\d{1,2}/\\d{1,2}/\\d{4}).*", "\\1", string)` ([demo](https://tio.run/##K/r/v7ikKDMvXcFGV0HJyEDBrSg1LzkDSOXnKmTk56YqGBrpG5nrGxkYWipxcaUXlyZpKCjpadlrxMSkVBvqGNXqozBMajX1tJR0FJRiYgyBFMRszf//AQ)) – Wiktor Stribiżew Aug 19 '20 at 14:25
  • It didn't. You managed to replace one hard-coded localized date literal with another. What if you have to deal with data from a *different* country? What if you have to parse German or Russian data, where the separator is `.`? The real solution is to use code that handles different locales, not to try to convert one ambiguous format to another. – Panagiotis Kanavos Aug 19 '20 at 14:32
  • BTW the duplicate is 10000% wrong - to parse localized dates you need to use the appropriate locale, not regular expressions. A good duplicate would show how to use e.g. lubridate with a specific locale – Panagiotis Kanavos Aug 19 '20 at 14:33
  • @PanagiotisKanavos Thank you for your input but the French part actually doesn't have anything to do with the country, it is a type of indwelling urinary catheter used in a hospital. The date at the end indicates the insertion date that I needed to extract and will always be in the same format – Julianne Kubes Aug 19 '20 at 18:14
  • @JulianneKubes never mind that errors which are bad even in accounting software are far more dangerous in healthcare. Why do you want to have to test **all of your code** for date parsing problems instead of using proper date parsing like all data science and analytic applications do? Why do you want to check *all of your code* to ensure 4/7 means July 4 - or is it April 7? Better yet, why do you want to write **two lines of code** when you can make [just a single call to parse_date_time](https://lubridate.tidyverse.org/reference/parse_date_time.html)? – Panagiotis Kanavos Aug 20 '20 at 07:58
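
The two suggestions in the comments can be sketched together as follows. This is a minimal illustration, not a definitive answer: the variable names (`greedy`, `lazy`, `direct`, `insertion_date`) are made up for the example, `perl = TRUE` is added as an assumption so the lazy quantifier is handled by PCRE, and base R's `as.Date` with an explicit format stands in for the lubridate call recommended above so the snippet needs no extra packages.

```r
string <- "20 French From home 12/27/2019"

# Original pattern: the leading greedy `.*` consumes as much text as it can,
# so `\\d{1,2}` is left with only the last digit of the month.
greedy <- gsub(".*(\\d{1,2}/\\d{1,2}/\\d{4}).*", "\\1", string)

# Suggested fix: the lazy `.*?` consumes as little as possible, so the
# capture group starts at the first digit of the month.
# perl = TRUE is an assumption added here, not part of the original comment.
lazy <- gsub(".*?(\\d{1,2}/\\d{1,2}/\\d{4}).*", "\\1", string, perl = TRUE)

# Alternative: pull out the leftmost match directly instead of replacing
# everything around it.
direct <- regmatches(string, regexpr("\\d{1,2}/\\d{1,2}/\\d{4}", string))

# Parse the extracted text with an explicit month/day/year format, so
# the field order is stated in the code rather than guessed.
insertion_date <- as.Date(lazy, format = "%m/%d/%Y")

print(greedy)
print(lazy)
print(direct)
print(insertion_date)
```

Stating the format explicitly in `as.Date` addresses the ambiguity concern raised above (whether 4/7 is April 7 or July 4), provided the insertion date really is always written month first.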

0 Answers