I have some data which comes in like the following:

  1. 8 Oct 2019
  2. 21 Jul 201621 Jul 2016
  3. 2 Apr 20202 April 20202 April 2020

I am trying to find a regex that will remove the duplicate instances, so that my end result looks like 1 rather than 2 or 3. I have looked online, but the regexes I found were for comma-separated or newline-separated values, whereas mine is all on one line and isn't separated by commas, spaces, or newlines. Could anyone suggest a suitable regex, please?

Many thanks!

user5903386

1 Answer

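You can try this:

Note: This regex assumes that the day is always 1 or 2 digits, the month is always 3 letters, and the year is always 4 digits. It does not remove the duplicates as such; it extracts the first date in the string and ignores the rest. If the pattern of your data changes, it will not work. You can add all the scenarios to your question to get a better answer.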

import re

items = ['8 Oct 2019', '21 Jul 201621 Jul 2016', '2 Apr 20202 April 20202 April 2020']
for item in items:
    # {1,2} rather than {,2}: the open-ended form is not understood by every regex flavor
    print(re.search(r'\d{1,2}\s\w{3}\s\d{4}', item).group(0))

Output:

8 Oct 2019
21 Jul 2016
2 Apr 2020
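
If the first date in a string might use a full month name (for example "2 April 2020"), a slightly more tolerant variant could look like the sketch below. It is only an illustration under stated assumptions: the helper name `first_date` is hypothetical, and the month is assumed to be 3 to 9 ASCII letters (enough for English month names up to "September"). The day quantifier is written as `{1,2}` because the open-ended `{,2}` form is not understood by every regex flavor, which may be why a pattern that works in Python fails in an online tester.

```python
import re

# Assumed shape: 1-2 digit day, 3-9 letter month name, 4-digit year.
# The 3-9 letter range is an assumption, not something stated in the question.
DATE_RE = re.compile(r'\d{1,2}\s[A-Za-z]{3,9}\s\d{4}')

def first_date(text):
    """Return the first date-like substring of text, or None if nothing matches."""
    match = DATE_RE.search(text)
    return match.group(0) if match else None

items = ['8 Oct 2019', '21 Jul 201621 Jul 2016', '2 Apr 20202 April 20202 April 2020']
for item in items:
    print(first_date(item))
```

Returning `None` instead of calling `.group(0)` directly avoids an `AttributeError` on strings that contain no date at all.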
moys
  • thanks, i tried putting the green regex into a regex tester and i added in my strings that i listed in my question, but it did not match :( – user5903386 Oct 26 '20 at 15:41