1

I'm trying to learn RegEx and tried to solve the below scenario:

Assume the the below string is a cell in a data input with thousands of rows:

"He who sleeps on the floor will not fall off the bed.Robert Gronock16-APR-2005This is a valid record."

I wrote the Regex ".+\s*(\d+-.+-\d+|[[:alpha:]]{3}\s*\d+,*\s\d+)" to extract the date only from the string but what I get is 6-APR-2005 missing the 1 at the beginning.

I don't want to use {2} as some rows has single digit like 8-Apr-2005 and it would return Null.

What I am doing wrong in the RegEx?

Usama Abdulrehman
  • 1,041
  • 3
  • 11
  • 21
  • 1
    The initial `.*` is being greedy and absorbing the first digit of the date. Make it lazy i.e. `.+?\s*(\d+-.+-\d+|[[:alpha:]]{3}\s*\d+,*\s\d+)` and it will work as desired. – Nick Jun 07 '20 at 02:15
  • Seems like over complicated regex. If your date will remain like `day numbers-month string-year numbers` then you can use something like this `\d{1,2}-\S{3}-\d{4}` – sallushan Jun 07 '20 at 02:30
  • @sallushan the rows contain multiple date formats: 16-APR-2005, 4-SEP-00, Jan 5 2000 and Feb 24, 1998. So this is the expression that I got to parse all these formats. Thanks – AliSalman86 Jun 07 '20 at 02:53
  • if you can provide possible date formats that may help us coming up some generic RegEx. – sallushan Jun 07 '20 at 02:54
  • @Nick worked like charm, thanks a lot – AliSalman86 Jun 07 '20 at 02:57
  • @AliSalman86 no worries - I"m glad it helped. – Nick Jun 07 '20 at 02:58
  • You could match strings against the regular expression `(?<!\d)\d{1,2}-[A-Z]{3}-\d{4}(?!\d)`. [Demo](https://regex101.com/r/OkMnct/1/). `\d{1,2}-[A-Z]{3}-\d{4}` matches the date. `(?<!\d)`, a negative lookbehind, asserts that the date is not preceded by a digit, so that you don't match, for example, `13-JAN-2000` in the string `913-JAN-2000`. `(?!\d)`, a negative lookahead, asserts that the match is not followed by a digit, to avoid matching, for example, `"13-JAN-2020"` in the string `"13-JAN-20201"`. – Cary Swoveland Jun 07 '20 at 03:51

0 Answers0