0

I want to match 1900-01-01 to 2099-12-31 in each of these formats:

YYYY
YYYY-MM
YYYY-MM-DD

This is my current solution:

^(19|20)[0-9]{1}[0-9]{1}-?([0,1]{0,1}[0-2]{0,1}){0,1}-?([0-3]{0,1}[0-9]{0,1}){0,1}

But my solution has at least 4 critical bugs which I'm not able to fix:

  1. 1921-00 matches successfully

    There is no restriction "Only 1 of 2 digits in month or date can be 0, but not both of them" in my solution

  2. 1921- matches successfully

    There is no restriction "The last symbol of the date can be only digit, not hyphen" in my solution

  3. 1921-1 matches successfully

    There is no restriction "Month and date may contain only 0 or 2 digits, not 1 digit" in my solution

And the main one:

  4. 1921-22 matches successfully

    There is no restriction "Date can't exist without month" in my solution

I'm using Python (if it matters). I'd be very grateful for help with adding these restrictions to my solution.
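
For anyone who wants to reproduce the four failures listed above, here is a minimal sketch. The names `ORIGINAL`, `SHOULD_FAIL` and `wrongly_accepted` are made up for illustration, and `re.fullmatch` is used so that the whole string has to match, which is an assumption about how the pattern is meant to be applied.

```python
import re

# The pattern from the question, copied unchanged.
ORIGINAL = r"^(19|20)[0-9]{1}[0-9]{1}-?([0,1]{0,1}[0-2]{0,1}){0,1}-?([0-3]{0,1}[0-9]{0,1}){0,1}"

# Inputs that should be rejected, one per bug described in the question.
SHOULD_FAIL = ["1921-00", "1921-", "1921-1", "1921-22"]


def wrongly_accepted(pattern, samples):
    """Return the samples that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [s for s in samples if compiled.fullmatch(s)]


if __name__ == "__main__":
    for sample in wrongly_accepted(ORIGINAL, SHOULD_FAIL):
        print(f"{sample!r} is accepted but should be rejected")
```

A list like `SHOULD_FAIL` is handy as a regression check: any candidate pattern can be passed to the same helper, and an empty result means none of the known bad inputs slip through.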

MonkeyZeus
jd2050
  • Not familiar with Python, but `[0,1]` should likely be `[01]` to match 0 or 1; inside a character class the comma is a literal, so `[0,1]` also allows commas. To restrict the month range use `0[1-9]|1[0-2]` – user3783243 Jun 16 '20 at 14:59
  • Do you expect to encounter the year 736? – MonkeyZeus Jun 16 '20 at 15:03
  • @MonkeyZeus No. The date should be in range from 1900-01-01 to 2099-12-31 – jd2050 Jun 16 '20 at 15:07
  • Are you interested in validating leap year dates? What about 2010-02-31? – MonkeyZeus Jun 16 '20 at 15:09
  • Is this a validator or just a "date-like" format finder? – MonkeyZeus Jun 16 '20 at 15:09
  • Something like `^(19|20)\d{2}(-(0[1-9]|1[0-2])(-([0-2]\d|3[01]))?)?$` might be closer to what you want. Date ranges with regex are a mess usually though. You probably should use a date parser. Consider February and leap years. – user3783243 Jun 16 '20 at 15:09
  • @MonkeyZeus It would be perfect to take February features into account, but it's not critical. – jd2050 Jun 16 '20 at 15:27
  • Well then see my [answer](https://stackoverflow.com/a/62411817/2191572) if regex is what you're forced to use but I think the `datetime` answer is the right way to go. If you're interested in validating dates using regex then check out https://stackoverflow.com/q/8647893/2191572 because your question is just a duplicate of that – MonkeyZeus Jun 16 '20 at 15:30
  • At any rate, when asking a regex question you should consider all these things before posting your question so that it's not a cat-n-mouse game of getting you to think about the requirements. – MonkeyZeus Jun 16 '20 at 15:32
  • As a matter of fact, https://stackoverflow.com/a/8648129/2191572 goes from 1800-2099 which can easily be adjusted to fit your needs. You should go upvote that answer – MonkeyZeus Jun 16 '20 at 15:34
  • @Monkey, re 736, OP says the range begins 1-Jan-1900, but if we were to go back, Sep 3-13, 1752 (Gregorian) would also be interesting. Nothing, absolutely nothing, happened during that period. – Cary Swoveland Jun 16 '20 at 17:10
  • @CarySwoveland Nothing happens on February 30th or 31st in any year, so that period in 1752 isn't all that special; ditto for June 31st :-) – MonkeyZeus Jun 16 '20 at 17:34

2 Answers

2

You can use the datetime module:

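from datetime import datetime

# Accepted layouts, tried in order: year, year-month, year-month-day.
dateformats = ("%Y", "%Y-%m", "%Y-%m-%d")
dates = ("2020", "2020-06", "2020-06-16", "2020-15", "2020-16-06", "1875-10-20")

for date_str in dates:
    for date_fmt in dateformats:
        try:
            date = datetime.strptime(date_str, date_fmt)
        except ValueError:
            # This layout does not fit; try the next one.
            pass
        else:
            # Parsed successfully; now check the allowed year range.
            if 1900 <= date.year <= 2099:
                print(f"{date_str} is valid.")
            else:
                print(f"{date_str} is not in valid range.")
            break
    else:
        # The inner loop finished without a break: no layout matched.
        print(f"{date_str} is not valid.")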

Output:

2020 is valid.
2020-06 is valid.
2020-06-16 is valid.
2020-15 is not valid.
2020-16-06 is not valid.
1875-10-20 is not in valid range.
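
One caveat: `strptime` is lenient about zero padding, so `%m` and `%d` also accept a single digit (for example `2020-6`), which the question wants rejected. A minimal sketch of one way to tighten this, assuming a cheap shape check in front of the parser is acceptable; `SHAPE`, `FORMATS` and `is_valid_date` are hypothetical names, not part of the answer above.

```python
import re
from datetime import datetime

# Assumption: only zero-padded YYYY, YYYY-MM or YYYY-MM-DD are acceptable.
# re.ASCII keeps \d limited to the digits 0-9.
SHAPE = re.compile(r"\d{4}(-\d{2}(-\d{2})?)?", re.ASCII)

# After the shape check the length alone identifies the layout.
FORMATS = {4: "%Y", 7: "%Y-%m", 10: "%Y-%m-%d"}


def is_valid_date(text, min_year=1900, max_year=2099):
    """Return True for a real calendar date in one of the three layouts."""
    if not SHAPE.fullmatch(text):
        return False
    try:
        parsed = datetime.strptime(text, FORMATS[len(text)])
    except ValueError:
        # Impossible month or day, e.g. 2020-13 or 2010-02-31.
        return False
    return min_year <= parsed.year <= max_year
```

The regex only checks the shape, and `datetime` decides whether the date exists, so February and leap years are handled without encoding them in the pattern.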
Asocia
1

Something like this seems to work:

^(19|20)\d{2}(?:-(?:0[1-9]|1[0-2])(?:-(?:0[1-9]|[12][0-9]|3[01]))?)?$
  • (19|20)\d{2} - 19 or 20 followed by 2 digits
  • (?: - start a non-capturing group; it is used for grouping and alternation (|) without capturing
  • -(?:0[1-9]|1[0-2]) - a dash followed by a month between 01-12
  • (?:-(?:0[1-9]|[12][0-9]|3[01]))? - an optional dash and day from 01-31
  • )? - make the month optional. The optional day is inside these parentheses because we only care about the day if a month precedes it.

https://regex101.com/r/oj5c1K/1/
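
Since the question mentions Python, here is a hedged sketch of how the pattern above could be used there. It is the same regex, only spread over several lines with `re.VERBOSE` so the pieces can be commented; `DATE_RE` and `matches` are names chosen for this example.

```python
import re

# Same pattern as above, laid out with re.VERBOSE so each part can be commented.
DATE_RE = re.compile(
    r"""
    ^(19|20)\d{2}                        # year 1900-2099
    (?:                                  # optional month part
        -(?:0[1-9]|1[0-2])               # dash, then month 01-12
        (?:                              # optional day part, only after a month
            -(?:0[1-9]|[12][0-9]|3[01])  # dash, then day 01-31
        )?
    )?
    $
    """,
    re.VERBOSE,
)


def matches(text):
    """Return True if the whole string has one of the three date shapes."""
    return DATE_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("1921", "1921-12", "1921-12-31", "1921-00", "1921-",
                   "1921-1", "1921-22", "2010-02-31"):
        print(sample, matches(sample))
```

Note that `2010-02-31` still matches: the pattern checks the shape and the 01-12 / 01-31 ranges, not whether the day exists in that month.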

MonkeyZeus
  • Since this answer has fixed all the bugs in the question, I marked it as correct. Thank you for your time! – jd2050 Jun 16 '20 at 15:42