I am looking to match a date string like below:

string1 = '11/13/2019 - 11/13/2019' 
string2 = '11/14/2019 11/14/2019'
string3 = '01/21/2019. . 11/20/2019'

I am using the below code to fetch all of them:

match : r"(\d+[/1]\d+[/1]\d+[ - ]\d+[/1]\d+[/1]\d+)"

But the above is giving me only string 1.

All three strings contain two dates. I want to match both dates in each string while ignoring the characters between them ('-', ' ', '. .'). Is there a way to do this?

Tomerikoo
pylearner
  • Read [converting-string-into-datetime](https://stackoverflow.com/questions/466345) – stovfl Apr 25 '20 at 08:14
  • What is the purpose of `[/1]`? Your regex matches `111111111111 11111111111111`, not sure that's a valid date! – Toto Apr 25 '20 at 09:34

4 Answers

Try with this: (\d+[/]\d+[/]\d+[-\s\.]*\d+[/]\d+[/]\d+)

Demo here
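As a quick check, the pattern above can be run against the three strings from the question. This is only a minimal sketch: the helper name `find_range` is made up for illustration and is not part of the answer.

```python
import re

# Pattern from the answer: two slash-separated dates with any run of
# hyphens, whitespace, or dots (possibly empty) between them.
pattern = re.compile(r"(\d+[/]\d+[/]\d+[-\s\.]*\d+[/]\d+[/]\d+)")

samples = [
    "11/13/2019 - 11/13/2019",
    "11/14/2019 11/14/2019",
    "01/21/2019. . 11/20/2019",
]

def find_range(text):
    """Hypothetical helper: return the matched date range or None."""
    m = pattern.search(text)
    return m.group(1) if m else None

for s in samples:
    print(repr(s), "->", find_range(s))
```

Because `re.search` is used, the range is also found when it is embedded in a longer string.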

GolamMazid Sajib

Building on @GolamMazidsajib’s answer, if you want to be a bit stricter about it, your regex should look like this:

^\d{2}/\d{2}/\d{4}[- .]+\d{2}/\d{2}/\d{4}$
  1. adding ^ at the beginning and $ at the end ensures that the string contains only the date range, not something like "something 11/14/2019 11/14/2019 something" (if you do want to allow that, just remove these anchors)
  2. {2} and {4} instead of + after each \d match only the specified number of digits, so strings like 111111/14/2019 11/14/2019 or 1/14/2019 11/14/2019 won’t be accepted
  3. + instead of * after the character set [-\s\.] requires at least one separator character between the dates, rejecting, for example, 11/14/201911/14/2019
  4. a literal space took the place of \s to filter out strings that contain any other whitespace separators, like a newline \n or a tab \t

I also simplified it a bit:

  • removed unnecessary character sets with a single slash in them: both [/] and / match just one character "/"
  • removed escape of the dot . inside the character set [- .], because inside a set it behaves like a literal and not a special character
  • removed the grouping parentheses around the whole regex, since you’d need those only if you wanted to extract the matched substring, which, judging from the question, you don’t

See the demo.
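The effect of each of the four changes can be checked in Python with a small sketch like the one below. The helper name `is_date_range` is an assumption made for illustration. One Python-specific caveat: `$` also matches just before a single trailing newline, so `\Z` or `re.fullmatch` may be preferable if that matters.

```python
import re

# Stricter pattern from this answer: anchored, fixed digit counts,
# at least one separator, and only hyphen, space, or dot as separators.
STRICT = re.compile(r"^\d{2}/\d{2}/\d{4}[- .]+\d{2}/\d{2}/\d{4}$")

def is_date_range(text):
    """Hypothetical helper: True if the whole string is a date range."""
    return STRICT.match(text) is not None

checks = [
    "11/13/2019 - 11/13/2019",                    # accepted
    "01/21/2019. . 11/20/2019",                   # accepted
    "something 11/14/2019 11/14/2019 something",  # rejected by ^ and $
    "1/14/2019 11/14/2019",                       # rejected by {2}
    "11/14/201911/14/2019",                       # rejected by +
    "11/14/2019\t11/14/2019",                     # rejected: tab is not in [- .]
]

for s in checks:
    print(repr(s), "->", is_date_range(s))
```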


Of course, we can go further (perhaps, even a bit overboard) and try to get closer to matching only the valid dates, with something like this:

^(?:0[1-9]|1[0-2])/(?:0[1-9]|[12]\d|3[01])/\d{4}[- .]+(?:0[1-9]|1[0-2])/(?:0[1-9]|[12]\d|3[01])/\d{4}$
  • (?:0[1-9]|1[0-2]) is a non-capturing group that matches either (notice |)
    • 0[1-9]: 0 followed by a digit from 1 to 9, or
    • 1[0-2]: 1 followed by 0, 1, or 2
      this way we require the month number to be one of 01, 02, 03, …, 12 and nothing else
  • (?:0[1-9]|[12]\d|3[01]) is also a non-capturing group that matches either of the three sequences:
    • 0[1-9]: 0 followed by a digit from 1 to 9, or
    • [12]\d: 1 or 2 followed by any digit, or
    • 3[01]: 3 with either 0 or 1 after it
      here we require the day to be one of 01, 02, 03, …, 31 and nothing else. Note that it doesn’t check whether the day is valid for the given month, so 02/31/2020 (the 31st of February) will be allowed
  • \d{4} left as is, allowing years to be from 0000 to 9999

See the examples in another demo.
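If calendar validity matters, one possible approach (not part of this answer, just a sketch) is to keep the regex for the overall shape and let `datetime.strptime` reject impossible dates such as 02/31/2020. The function name `parse_range` is hypothetical.

```python
import re
from datetime import datetime

# The stricter regex from above, split across two lines for readability.
DATE_RANGE = re.compile(
    r"^(?:0[1-9]|1[0-2])/(?:0[1-9]|[12]\d|3[01])/\d{4}[- .]+"
    r"(?:0[1-9]|1[0-2])/(?:0[1-9]|[12]\d|3[01])/\d{4}$"
)

def parse_range(text):
    """Hypothetical helper: return (start, end) datetimes, or None if invalid."""
    if not DATE_RANGE.match(text):
        return None
    try:
        # strptime raises ValueError for dates that do not exist.
        start, end = (
            datetime.strptime(d, "%m/%d/%Y")
            for d in re.findall(r"\d{2}/\d{2}/\d{4}", text)
        )
    except ValueError:
        return None
    return start, end

print(parse_range("01/21/2019. . 11/20/2019"))
print(parse_range("02/31/2020 - 03/01/2020"))  # passes the regex, fails strptime
```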

sainaen

I tried using this and this worked.

[r"(\d+/\d+/\d+\s-\s\d+/\d+/\d+)", r"(\d+/\d+/\d+\s+\d+/\d+/\d+)"]
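One way to apply such a list is to try each pattern in turn and return the first hit. The sketch below assumes a hypothetical helper named `first_match`. Note that neither pattern allows dots between the dates, so string3 from the question is not covered by this list.

```python
import re

# The two patterns from this answer, tried in order.
PATTERNS = [
    r"(\d+/\d+/\d+\s-\s\d+/\d+/\d+)",
    r"(\d+/\d+/\d+\s+\d+/\d+/\d+)",
]

def first_match(text):
    """Hypothetical helper: return the first pattern's match, or None."""
    for p in PATTERNS:
        m = re.search(p, text)
        if m:
            return m.group(1)
    return None

for s in ["11/13/2019 - 11/13/2019",
          "11/14/2019 11/14/2019",
          "01/21/2019. . 11/20/2019"]:
    print(repr(s), "->", first_match(s))
```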
pylearner

You can use this single regex for both formats:

^\d+/\d+/\d+\s(?:-\s)?\d+/\d+/\d+$

Demo & explanation
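A short sketch of how the optional group `(?:-\s)?` behaves in Python follows; the helper name `matches` is made up for illustration. The pattern handles the hyphen and plain-space formats, while the dotted separator of string3 falls outside it.

```python
import re

# Single pattern from this answer: one whitespace character, then an
# optional "hyphen + whitespace" group between the two dates.
SINGLE = re.compile(r"^\d+/\d+/\d+\s(?:-\s)?\d+/\d+/\d+$")

def matches(text):
    """Hypothetical helper: True if the whole string fits the pattern."""
    return SINGLE.match(text) is not None

for s in ["11/13/2019 - 11/13/2019",
          "11/14/2019 11/14/2019",
          "01/21/2019. . 11/20/2019"]:
    print(repr(s), "->", matches(s))
```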

Toto