
I've created a regex like this:

(((([1-9]|1[0-9]|2[0-8])[-]([1-9]|1[0-2]))|((29|30|31)[-]([13578]|1[02]))|((29|30)[-]([469]|11)))[-]([0-9][0-9][0-9][0-9]))|(29[-]2[-](([0-9][0-9])(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96)))

Everything works fine except for these dates:

29-2-2017 (it matches 9-2-2017)

31-11-2017 (it matches 1-11-2017)

These dates don't exist: 2017 is not a leap year, and November has only 30 days. How can I prevent them from being matched as valid?

Working example: https://regex101.com/r/mjfoAH/2
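The partial matches can be reproduced outside regex101. Below is a minimal sketch in Python, chosen only for illustration (the question targets an ASP.NET validator, and the names `ORIGINAL` and `first_match` are made up for this example). It shows that an unanchored search finds a valid-looking date starting at the second character, while requiring the whole input to match rejects both strings.

```python
import re

# The regex from the question, copied verbatim (split across lines for readability).
ORIGINAL = re.compile(
    r"(((([1-9]|1[0-9]|2[0-8])[-]([1-9]|1[0-2]))"
    r"|((29|30|31)[-]([13578]|1[02]))"
    r"|((29|30)[-]([469]|11)))[-]([0-9][0-9][0-9][0-9]))"
    r"|(29[-]2[-](([0-9][0-9])"
    r"(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96)))"
)


def first_match(text):
    """Return the substring found by an unanchored search, or None."""
    m = ORIGINAL.search(text)
    return m.group(0) if m else None


print(first_match("29-2-2017"))   # 9-2-2017
print(first_match("31-11-2017"))  # 1-11-2017

# Requiring the entire input to match rejects the invalid dates.
print(ORIGINAL.fullmatch("29-2-2017"))        # None
print(ORIGINAL.fullmatch("31-11-2017"))       # None
print(bool(ORIGINAL.fullmatch("28-2-2017")))  # True
```

The pattern itself never accepts `29-2-2017` as a whole; the false positive comes from the search being allowed to start in the middle of the day number.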

EDIT

I've finally managed to edit my regex to match the format I need. Here it is for future readers:

((((\b[1-9]\b|1[0-9]|2[0-8])[-]([1-9]|1[0-2]))|((29|30|31)[-]([13578]|1[02]))|((29|30)[-]([469]|11)))[-]([0-9][0-9][0-9][0-9]))|(29[-]2[-](([0-9][0-9])(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96)))

Working example: https://regex101.com/r/mjfoAH/3
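As a quick check of the edited regex, here is a small Python sketch (Python is used only for illustration; `EDITED` and `accepts` are names invented for this example, and .NET may differ in details not exercised here). It confirms that the two problem dates no longer match anywhere in the string and that only the d-M-yyyy form is accepted. It also surfaces one case worth knowing about: the last alternative accepts any year ending in 00, so `29-2-1900` matches even though 1900 was not a leap year.

```python
import re

# The edited regex from the question, copied verbatim (split across lines for readability).
EDITED = re.compile(
    r"((((\b[1-9]\b|1[0-9]|2[0-8])[-]([1-9]|1[0-2]))"
    r"|((29|30|31)[-]([13578]|1[02]))"
    r"|((29|30)[-]([469]|11)))[-]([0-9][0-9][0-9][0-9]))"
    r"|(29[-]2[-](([0-9][0-9])"
    r"(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96)))"
)


def accepts(text):
    """True when the whole input is matched by the edited regex."""
    return EDITED.fullmatch(text) is not None


for date in ["1-1-2017", "31-12-2017", "30-4-2017", "29-2-2016"]:
    print(date, accepts(date))  # all True

for date in ["29-2-2017", "31-11-2017", "31-4-2017", "01-01-2017"]:
    print(date, accepts(date), EDITED.search(date))  # all False / None

# Century years are not filtered by the "00" entry in the leap-year list.
print(accepts("29-2-1900"))  # True, although 1900 was not a leap year
```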

P.S. Regarding the possible duplicate: the linked question is about a regex for a different date format, and the answer accepted there doesn't handle leap years. That's why I created this question.

Ashiv3r
  • Use the date feature of the programming language you're using to verify a date – baao Nov 21 '17 at 13:44
  • [Use anchors?](https://regex101.com/r/Is7Iid/1) – Aaron Nov 21 '17 at 13:44
  • @baao - I can't in this case. I need to create a regular expression for a standard ASP.NET validator. – Ashiv3r Nov 21 '17 at 13:47
  • Possible duplicate of [Regular Expression to match valid dates](https://stackoverflow.com/questions/51224/regular-expression-to-match-valid-dates) - look at the second answer. – assylias Nov 21 '17 at 14:10
  • See [my answer to a date question a couple months ago](https://stackoverflow.com/a/46414732/3600709). The regex is configurable to your needs (you can create your own date or datetime format from the regex) and it does valid leap years. Note the edited section of my question performs slightly faster than the original answer – ctwheels Nov 21 '17 at 14:18
  • @ctwheels - Indeed it's quite good, but I need it written in one line specifically for my needs. – Ashiv3r Nov 21 '17 at 14:46
  • @Ashiv3r you can convert it into a one-liner. See the second chunk of code on that answer, it's the one-liner version of the regex above it. – ctwheels Nov 21 '17 at 14:48

2 Answers


Brief

As per my comment, I've written an answer on this post that deals with dates and leap years. The regex is configurable to your needs. Below I've made those tweaks to create a regular expression that will work for your format.


Code

Regex (with definition construct)

See regex in use here

(?(DEFINE)
  (?# Date )
    (?# Day ranges )
    (?<d_day28>0[1-9]|1\d|2[0-8]|[1-9])
    (?<d_day29>0[1-9]|1\d|2\d|[1-9])
    (?<d_day30>0[1-9]|1\d|2\d|30|[1-9])
    (?<d_day31>0[1-9]|1\d|2\d|3[01]|[1-9])
    (?# Month specifications )
    (?<d_month28>0?2)
    (?<d_month29>0?2)
    (?<d_month30>0?[469]|11)
    (?<d_month31>0?[13578]|1[02])
    (?# Year specifications )
    (?<d_year>\d+)
    (?<d_yearLeap>(?:\d*?(?:(?:0[48]|[13579][26]|[2468][048])|(?:(?:[02468][048]|[13579][26])00))|[48]00|[48])(?=\D|\b))
    (?# Valid date formats )
    (?<d_format>
      (?&d_day28)-(?&d_month28)-(?&d_year)|
      (?&d_day29)-(?&d_month29)-(?&d_yearLeap)|
      (?&d_day30)-(?&d_month30)-(?&d_year)|
      (?&d_day31)-(?&d_month31)-(?&d_year)
    )
)
\b(?&d_format)\b

Regex (without definition construct/one-liner)

See regex in use here

\b(?:(?:0[1-9]|1\d|2[0-8]|[1-9])-(?:0?2)-(?:\d+)|(?:0[1-9]|1\d|2\d|[1-9])-(?:0?2)-(?:(?:\d*?(?:(?:0[48]|[13579][26]|[2468][048])|(?:(?:[02468][048]|[13579][26])00))|[48]00|[48])(?=\D|\b))|(?:0[1-9]|1\d|2\d|30|[1-9])-(?:0?[469]|11)-(?:\d+)|(?:0[1-9]|1\d|2\d|3[01]|[1-9])-(?:0?[13578]|1[02])-(?:\d+))\b
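The one-liner uses only constructs that Python's `re` module also supports, so it can be exercised with a short script. The following is a sketch for illustration only (the names `ONE_LINER` and `is_valid` are invented here, and the original target is an ASP.NET validator rather than Python). It checks a handful of dates, including the two from the question.

```python
import re

# The one-liner from above, copied verbatim (split across lines for readability).
ONE_LINER = re.compile(
    r"\b(?:(?:0[1-9]|1\d|2[0-8]|[1-9])-(?:0?2)-(?:\d+)"
    r"|(?:0[1-9]|1\d|2\d|[1-9])-(?:0?2)-(?:(?:\d*?(?:(?:0[48]|[13579][26]|[2468][048])"
    r"|(?:(?:[02468][048]|[13579][26])00))|[48]00|[48])(?=\D|\b))"
    r"|(?:0[1-9]|1\d|2\d|30|[1-9])-(?:0?[469]|11)-(?:\d+)"
    r"|(?:0[1-9]|1\d|2\d|3[01]|[1-9])-(?:0?[13578]|1[02])-(?:\d+))\b"
)


def is_valid(text):
    """True when the whole input is a date accepted by the one-liner."""
    return ONE_LINER.fullmatch(text) is not None


print(is_valid("31-12-2017"))  # True
print(is_valid("29-2-2016"))   # True  (leap year)
print(is_valid("29-2-2000"))   # True  (divisible by 400)
print(is_valid("29-2-1900"))   # False (century year, not divisible by 400)
print(is_valid("29-2-2017"))   # False
print(is_valid("31-11-2017"))  # False
print(is_valid("01-01-2017"))  # True  (leading zeros are accepted)

# The leading \b also stops a search from starting inside the day number.
print(ONE_LINER.search("29-2-2017"))   # None
print(ONE_LINER.search("31-11-2017"))  # None
```

Note that this version accepts both `1-1-2017` and `01-01-2017`, because every day and month group allows an optional leading zero.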

Explanation

Below I've copied the explanation from the linked post (my answer on another question). The explanation is pretty much the same (minus the time properties).

I'll explain the first version, as the second version is simply a slimmed-down version of it. Note that the regex can easily be changed to accommodate more formats (only 1 format with slight variations is accepted, but this is a very customizable regex).

  • d_day28: Match any day from 1 to 28 (a leading zero is allowed for 1 to 9, e.g. 01)
  • d_day29: Match any day from 1 to 29 (a leading zero is allowed for 1 to 9)
  • d_day30: Match any day from 1 to 30 (a leading zero is allowed for 1 to 9)
  • d_day31: Match any day from 1 to 31 (a leading zero is allowed for 1 to 9)
  • d_month28: Match the month that may have only 28 days (February, thus 2 or 02)
  • d_month29: Match the month that may have 29 days (February, thus 2 or 02)
  • d_month30: Match months that have only 30 days (April, June, September, November, thus 4, 6, 9, 11, with an optional leading zero on the single-digit months)
  • d_month31: Match months that have 31 days (January, March, May, July, August, October, December, thus 1, 3, 5, 7, 8, 10, 12, with an optional leading zero on the single-digit months)
  • d_year: Match any year (must have at least one digit \d)
  • d_yearLeap: I'll break this into multiple segments for better clarity
    • \d*?
      • Match any number of digits, but as few as possible
    • Match one of the following
      • (?:0[48]|[13579][26]|[2468][048]) - Match two digits that form a multiple of 4 other than 00 (the linked answer writes this as (?!00)[02468][048]|[13579][26], which accepts the same strings)
        • 0, followed by one of 48
        • One of 13579, followed by one of 26
        • One of 2468, followed by one of 048
      • (?:(?:[02468][048]|[13579][26])00) - Match one of the following, followed by 00
        • One of 02468, followed by one of 048
        • One of 13579, followed by one of 26
      • [48]00 - Match 400 or 800
      • [48] - Match 4 or 8
    • (?=\D|\b) - Ensure what follows is either a non-digit character \D or word boundary character \b
  • d_format: This points to previous groups in order to ensure months are properly formatted and match the days/month and days/year(leap year) requirements so that we can ensure proper date validation
  • The remaining groups belong to the datetime regex in the linked answer and are not used in the date-only regex above:
    • t_period: This was added in case others needed this for validation purposes
      • Ensures the period is either am, pm, a.m, p.m or their respective uppercase versions (including things such as a.M where multiple cases are used)
    • t_hours12: Match any hour from 00 to 11
    • t_hours24: Match any hour from 00 to 23
    • t_minutes: Match any minutes from 00 to 59
    • t_seconds: Match any seconds from 00 to 59
    • t_milliseconds: Match any 3 digits (000 to 999)
    • t_format: This points to previous groups in order to ensure time is properly formatted. I've added an additional time setting (as well as an addition including milliseconds and time period for others' use)
    • dt_format: Datetime format to check against (date and time, separated by a space character)
  • Following the define block is \b(?&d_format)\b (\b(?&dt_format)\b in the linked answer), which simply matches the format as specified above, ensuring that what precedes and follows it is a word boundary \b (or no character)
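The named-group layout described above can be approximated in engines that lack the `(?(DEFINE)...)` construct by assembling the pattern from string fragments. The sketch below does this in Python, whose `re` module does not support DEFINE. The fragment texts are copied from the definition block; the Python names (`DAY`, `MONTH`, `YEAR`, `YEAR_LEAP`, `date_pattern`) are hypothetical and exist only for this example.

```python
import re

# Fragments copied from the DEFINE block, keyed by the number of days they allow.
DAY = {
    28: r"(?:0[1-9]|1\d|2[0-8]|[1-9])",
    29: r"(?:0[1-9]|1\d|2\d|[1-9])",
    30: r"(?:0[1-9]|1\d|2\d|30|[1-9])",
    31: r"(?:0[1-9]|1\d|2\d|3[01]|[1-9])",
}
MONTH = {
    28: r"(?:0?2)",
    29: r"(?:0?2)",
    30: r"(?:0?[469]|11)",
    31: r"(?:0?[13578]|1[02])",
}
YEAR = r"(?:\d+)"
YEAR_LEAP = (
    r"(?:(?:\d*?(?:(?:0[48]|[13579][26]|[2468][048])"
    r"|(?:(?:[02468][048]|[13579][26])00))|[48]00|[48])(?=\D|\b))"
)


def date_pattern():
    """Join the four day/month/year combinations, mirroring d_format."""
    alternatives = [
        f"{DAY[28]}-{MONTH[28]}-{YEAR}",
        f"{DAY[29]}-{MONTH[29]}-{YEAR_LEAP}",
        f"{DAY[30]}-{MONTH[30]}-{YEAR}",
        f"{DAY[31]}-{MONTH[31]}-{YEAR}",
    ]
    return r"\b(?:" + "|".join(alternatives) + r")\b"


DATE_RE = re.compile(date_pattern())

print(bool(DATE_RE.fullmatch("28-2-2017")))   # True
print(bool(DATE_RE.fullmatch("29-02-2016")))  # True
print(bool(DATE_RE.fullmatch("30-04-2017")))  # True
print(bool(DATE_RE.fullmatch("29-2-2017")))   # False
print(bool(DATE_RE.fullmatch("31-4-2017")))   # False
```

Keeping the fragments separate makes it easier to adjust one piece, for example dropping the `0[1-9]` and `0?` parts if leading zeros should be rejected, without re-reading the whole one-liner.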

Leap year

To further understand the leap year section of the regex...

I am assuming the following:

  • No year is a leap year unless the following is true
    • ((Year modulo 4 is 0) AND (year modulo 100 is not 0)) OR (year modulo 400 is 0)
    • Source: leap year calculation
    • Leap years have always existed (at least since year 1) - since I don't want to start assuming and do even more research.

The regex works by ensuring:

  1. All leap years that end in 0, 4, 8 are preceded by a 0, 2, 4, 6, 8 (all of which result in 0 after modulus -> i.e. 24 % 4 = 0)
  2. All leap years that end in 2, 6 are preceded by a 1, 3, 5, 7, 9 (all of which result in 0 after modulus -> i.e. 32 % 4 = 0)
  3. Years that end in 00 are excluded from 1. and 2. (the linked answer uses (?!00) for this; the regex above gets the same effect by listing 0[48] and [2468][048] separately)
  4. Leap years that end in 00 have those two zeros preceded by digits that satisfy 1. or 2. (exactly the same rule, since 4 * 100 = 400; nothing needs to be changed except the last two digits)
  5. Add the years 400, 800, 4, 8 since they are not satisfied by any of the above conditions
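The rules above can be checked against the arithmetic definition. The following Python sketch (illustration only; `YEAR_LEAP` and `is_leap` are names chosen for this example) takes the d_yearLeap pattern without its trailing lookahead and compares it with the modulo rule for every year from 1 to 9999, written without leading zeros.

```python
import re

# d_yearLeap from the regex above, minus the trailing lookahead:
# fullmatch already forces the pattern to consume the whole year.
YEAR_LEAP = re.compile(
    r"(?:\d*?(?:(?:0[48]|[13579][26]|[2468][048])"
    r"|(?:(?:[02468][048]|[13579][26])00))|[48]00|[48])"
)


def is_leap(year):
    """Leap-year rule as stated in the assumptions above."""
    return (year % 4 == 0 and year % 100 != 0) or year % 400 == 0


# Years on which the pattern and the arithmetic rule disagree.
mismatches = [
    y for y in range(1, 10000)
    if (YEAR_LEAP.fullmatch(str(y)) is not None) != is_leap(y)
]
print(mismatches)  # []
```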
ctwheels
  • Almost got it, but the format I need is supposed to accept only d/M/yyyy, and this allows dd/MM/yyyy too :) – Ashiv3r Nov 21 '17 at 15:06
  • So you don't need to match dates that are preceded by `0`? So instead of matching `01-01-2017`, you want to match `1-1-2017`? – ctwheels Nov 21 '17 at 15:07
  • Yes, it's only for certain culture info. I've already got regex for dd/mm/yyyy done :) – Ashiv3r Nov 21 '17 at 15:08
  • In that case see [this](https://regex101.com/r/bnUaF8/4) regex, it no longer matches preceding `0`. Why wouldn't you just use the same regex for both cases though? – ctwheels Nov 21 '17 at 15:11
  • Almost, but it doesn't match 31-12-2017, which is a valid date... And I need to create different regexes for different culture infos :) – Ashiv3r Nov 21 '17 at 17:43
  • I've added an answer to my problem in the edit section of the question. – Ashiv3r Nov 22 '17 at 09:04

(Same information as in the EDIT section of the question.)

I've finally managed to edit my regex to match the format I need. Here it is for future readers:

((((\b[1-9]\b|1[0-9]|2[0-8])[-]([1-9]|1[0-2]))|((29|30|31)[-]([13578]|1[02]))|((29|30)[-]([469]|11)))[-]([0-9][0-9][0-9][0-9]))|(29[-]2[-](([0-9][0-9])(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96)))

Working example: https://regex101.com/r/mjfoAH/3

Ashiv3r