17

I need help coming up with a regex to make sure the user enters a valid date. The string will be in the format mm/dd/yyyy.

Here is what I have come up with so far.

/\[1-9]|0[1-9]|1[0-2]\/\d{1,2}\/19|20\d\d/

I have validated the regex so that the user cannot enter a month higher than 12 and the year has to start with either "19" or "20". What I am having trouble with is figuring out some logic for validating the day. The day should not go over 31.

Zerobu
  • 10
    A regex seems like a strange tool for this problem. Can you not convert the text to a date and check its values? – Tom Hamming May 20 '11 at 20:01
  • 3
    While this could theoretically be done, a regex that handled the correct number of days for each month (including leap years) would be insanely complex. Why not just split the date up and test each component? (Or better yet, use one of the many date parsers on CPAN.) – cjm May 20 '11 at 20:06
  • @Zerobu, why are you asking the same question again that you asked a year ago? [questions/2573466/matching-a-date-in-perl](http://stackoverflow.com/questions/2573466/matching-a-date-in-perl). OK, a year ago you didn't get a regex answer, but I hope you can see from @Seth's answer that a regex is not useful for validating a date. – stema May 20 '11 at 20:39
  • If I wanted all conditions such as the leap year, then I would have said so – Zerobu May 20 '11 at 20:49
  • @Zerobu, my point is simply that most people start thinking they just need a simple regex for a date string, but end up needing something "real". If that's not the case for you, no problem :) If it is, well, my answer will still be here when you need it! – Ryley May 20 '11 at 20:53
  • 2
    Zerobu: you said so: "a regex to make sure the user enters a valid date". – ysth May 20 '11 at 21:17
  • Unless using a regex is a static/immutable requirement, consider using other approaches, like .NET's DateTime.TryParseExact(value, format) – Cleptus Sep 26 '19 at 10:30

11 Answers

49

Regex for 01-31:

(0[1-9]|[12]\d|3[01])

Or if you don't want days with a preceding zero (e.g. 05):

([1-9]|[12]\d|3[01])
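
A quick way to check either pattern is to anchor it and run it over every candidate from 0 to 32. The sketch below assumes Perl; the helper name `matching_days` and the `\A` and `\z` anchors are additions for illustration and are not part of the answer.

```perl
use strict;
use warnings;

# Anchored copies of the two patterns above; \A and \z are added here so
# that the whole string has to be a day number.
my $padded   = qr/\A(?:0[1-9]|[12]\d|3[01])\z/;
my $unpadded = qr/\A(?:[1-9]|[12]\d|3[01])\z/;

# Hypothetical helper: return the candidates that a pattern accepts.
sub matching_days {
    my ( $pattern, @candidates ) = @_;
    return grep { $_ =~ $pattern } @candidates;
}

my @two_digit = map { sprintf '%02d', $_ } 0 .. 32;    # "00" .. "32"
my @plain     = 0 .. 32;                               # "0" .. "32"

print "padded:   @{[ matching_days( $padded,   @two_digit ) ]}\n";
print "unpadded: @{[ matching_days( $unpadded, @plain ) ]}\n";
```

With the anchors in place, both patterns accept 10, 20, and 30 and reject 0, 00, and 32.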
bluepnume
  • Thanks apparently the other posters had a hard time figuring out what i was looking for, even though I already said what I needed – Zerobu May 20 '11 at 20:47
  • 3
    That's because the date won't necessarily be valid. It won't, for example, reject 31 February 2011. Regex is the wrong tool for this. – MRAB May 20 '11 at 22:56
  • 1
    was 1974 a leap year? what about 2000? – Joel Berger May 21 '11 at 03:18
  • Wouldn't this allow a value of '00'? – Jeromy French Apr 08 '13 at 19:56
  • 3
    This doesn't allow for days ending in zero -- "10" or "20". – Joe Krill Oct 10 '13 at 17:19
  • Doesn't work for 10 and 20. See: `gawk 'BEGIN{ for(i=0;i<=32;i++){ if (i ~ /^([0-2]?[1-9]|3[01])$/){print i "yes"}else {print i "no"} } }'` – Kent Pawar Dec 09 '13 at 14:27
  • The first one works fine now; but I am seeing that the 2nd regex matches 0, 32, and misses to match double digit numbers.. Could you kindly check - [test](http://regexr.com?37lbr). Also you could refer to [my post below](http://stackoverflow.com/a/20473796/985766). Cheers – Kent Pawar Dec 15 '13 at 07:09
16
  • As many have noted above, if we want to validate the date as a whole then a RegEx is a very poor choice.
  • But if we want to match a pattern of numbers, in this case from 01-31 then RegEx is fine so long as there is some backend logic that validates the date as a whole, if so desired.
  • I see the expected answer currently fails for 10, 20.

    • Test: gawk 'BEGIN{ for(i=0;i<=32;i++){ if (i ~ /^([0-2]?[1-9]|3[01])$/){print i " yes"}else {print i " no"} } }'
    • This can be corrected as follows: ^([0-2]?[1-9]|3[01]|10|20)$

So kindly consider the following solution...

1. Identify the sets that need to be matched:

  • Days with prefix "0": {01,...,09},{10,...,31}
    • Sub-set {10,...,31} can be split into => {10,...,29},{30,31}
  • Without any prefix: {1,...,31} => {1,...,9},{10,...,31}

2. Corresponding regular expressions for each sub-set:

---------------------------------
Sub-Set     |  Regular-Expression
---------------------------------
{01,...,09} | [0][1-9]
{10,...,29} | [1-2][0-9]
{30,31}     | 3[01]
{1,...,9}   | [1-9]
---------------------------------

Now we can group ([0][1-9]) and ([1-9]) together as ([0]?[1-9]). Where ? signifies 0 or 1 occurrences of the pattern/symbol. [UPDATE] - Thank you @MattFrear for pointing it out.

So the resulting RegEx is: ^(([0]?[1-9])|([1-2][0-9])|(3[01]))$

Tested here: http://regexr.com/?383k1 [UPDATE]
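
The difference between the three variants discussed in this answer shows up on a handful of inputs. The following Perl sketch is only an illustration; the hash keys, the helper `accepts`, and the sample list are made up for the demonstration.

```perl
use strict;
use warnings;

# The three variants mentioned in this answer, keyed by made-up labels.
my %variant = (
    original => qr/^([0-2]?[1-9]|3[01])$/,                   # misses 10 and 20
    star     => qr/^(([0]*[1-9])|([1-2][0-9])|(3[01]))$/,    # accepts 001
    final    => qr/^(([0]?[1-9])|([1-2][0-9])|(3[01]))$/,
);

# Hypothetical helper: 1 if the named variant accepts the text, else 0.
sub accepts {
    my ( $name, $text ) = @_;
    return $text =~ $variant{$name} ? 1 : 0;
}

for my $text (qw(0 1 01 001 10 20 29 31 32)) {
    printf "%-4s original=%d star=%d final=%d\n", $text,
        map { accepts( $_, $text ) } qw(original star final);
}
```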

Trenton McKinney
Kent Pawar
  • 1
    Top marks for the explanation. The original question isn't worded particularly well (not the OPs fault) but if you're looking for REGEXP to validate a day of the month this is it. – Betjamin Richards Jan 08 '14 at 09:30
  • 2
    An improvement over the accepted answer. However, this will accept 001, 00000001, etc. I suggest ^(([0]?[1-9])|([1-2][0-9])|(3[01]))$ http://regexr.com?383k1 – Matt Frear Jan 23 '14 at 10:08
  • 1
    @MattFrear - Yes good catch! I made a mistake while combining `([0][1-9])` and `([1-9])` together as `([0]*[1-9])`... while it should clearly be `([0]?[1-9])`. Thank you for sharing the test case. – Kent Pawar Jan 23 '14 at 10:16
14
use DateTime;

Other solutions are fine, probably work, etc. Usually, you end up wanting to do a bit more, and then a bit more, and eventually you have some crazy code, and leap years, and why are you doing it yourself again?

DateTime and its formatters are your solution. Use them! Sometimes they are a bit overkill, but often that works out for you down the road.

use DateTime::Format::Strptime;

my $dayFormat = DateTime::Format::Strptime->new(pattern => '%m/%d/%Y');  # mm/dd/yyyy, as in the question
my $foo = $dayFormat->parse_datetime($myDateString);

$foo is now a DateTime object. Enjoy.

If your date string wasn't properly formatted, $foo will be "undef" and $dayFormat->errstr will tell you why.

ysth
Ryley
4
^(((((((0?[13578])|(1[02]))[\.\-/]?((0?[1-9])|([12]\d)|(3[01])))|(((0?[469])|(11))[\.\-/]?((0?[1-9])|([12]\d)|(30)))|((0?2)[\.\-/]?((0?[1-9])|(1\d)|(2[0-8]))))[\.\-/]?(((19)|(20))?([\d][\d]))))|((0?2)[\.\-/]?(29)[\.\-/]?(((19)|(20))?(([02468][048])|([13579][26])))))$

From Expressions in category: Dates and Times

Validates the correct number of days in a month, looks like it even handles leap years.

You can of course replace [\.\-/] with / to only allow slashes.
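
For comparison, the same month-length and leap-year rules can be written as ordinary code, with a short regex used only to check the shape and pull out the fields. This is a minimal sketch under the assumption that the input is mm/dd/yyyy with a four-digit year; `is_leap_year` and `is_valid_mdy` are hypothetical names, not taken from any module.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 for Gregorian leap years, else 0.
sub is_leap_year {
    my ($year) = @_;
    return ( ( $year % 4 == 0 && $year % 100 != 0 ) || $year % 400 == 0 ) ? 1 : 0;
}

# Hypothetical helper: 1 if the text is a real calendar date in mm/dd/yyyy.
sub is_valid_mdy {
    my ($text) = @_;

    # The regex only checks the shape; the ranges are checked arithmetically.
    my ( $month, $day, $year ) =
        $text =~ m{\A([0-9]{1,2})/([0-9]{1,2})/([0-9]{4})\z}
        or return 0;

    return 0 if $month < 1 || $month > 12;

    my @days_in_month =
        ( 31, is_leap_year($year) ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 );

    return ( $day >= 1 && $day <= $days_in_month[ $month - 1 ] ) ? 1 : 0;
}

for my $date (qw(02/29/1900 02/29/2000 04/31/2021 12/31/1999)) {
    printf "%s %s\n", $date, is_valid_mdy($date) ? 'valid' : 'not valid';
}
```

Unlike the long expression, this version treats 1900 as a common year, because the divisible-by-100 rule is spelled out explicitly.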

Kent Pawar
Seth Robertson
  • 14
    +1 for a comical amount of complexity. @Zerobu, hopefully this shows you why regex is a bad idea for this problem! Even this beast isn't fully correct: as the original author notes, it incorrectly matches 02/29/1900. – OpenSauce May 20 '11 at 20:23
2

This isn't all that hard...

qr#^
    (?: 0[1-9] | 1[012] )
    /
    (?:
        0[1-9] | 1[0-9] | 2[0-8]
        | (?<! 0[2469]/ | 11/ ) 31
        | (?<! 02/ ) 30
        | (?<! 02/
             (?= ... 
                 (?: 
                     .. (?: [02468][1235679] | [13579][01345789] )
                     | (?: [02468][1235679] | [13579][01345789] ) 00
                 )
             )
        ) 29
    )
    /
    [0-9]{4}
    \z
#x
ysth
2

If you want to check for valid dates, you have to do much more than check numbers and ranges. Fortunately, Perl already has everything you need for this. The Time::Piece module comes with Perl and can parse a date. It knows how to parse dates and do the first round of checks:

use v5.10;

use Time::Piece; # comes with Perl

my @dates = qw(
    01/06/2021 01/37/456 10/6/1582 10/18/1988
    2/29/1900 2/29/1996 2/29/2000
    );

foreach my $date ( @dates ) {
    my $t = eval { Time::Piece->strptime( $date, '%m/%d/%Y' ) };
    unless( $t ) {
        say "Date <$date> is not valid";
        next;
        }
    say $t;
    }

The output is interesting and no other solution here is close to handling this. Why is 10/6/1582 an invalid date? It doesn't exist in the Gregorian calendar, but there's a simpler reason here. strptime doesn't handle dates before 1900.

But also notice that 2/29/1900 gets turned into 3/1/1900. That's weird and we should fix that, but there's no leap years in years divisible by 100. Well, unless they are divisible by 400, which is why 2/29/2000 works.

Wed Jan  6 00:00:00 2021
Date <01/37/456> is not valid
Date <10/6/1582> is not valid
Tue Oct 18 00:00:00 1988
Thu Mar  1 00:00:00 1900
Thu Feb 29 00:00:00 1996
Tue Feb 29 00:00:00 2000

But let's fix that leap year issue. The tm struct is doing a dumb conversion. If the individual numbers are within a reasonable range (0 to 31 for days) regardless of the month, then it converts those days to seconds and adds them to the offset. That's why 2/29/1900 ends up a day later: that 29 gives the same number of seconds as 3/1/1900. If the date is valid, it should come back the same. And since I'm going to roundtrip this, I fix up the date for leading zeros before I do anything with it:

use v5.10;

use Time::Piece; # comes with Perl

my @dates = qw(
    01/06/2021 2/29/1900 2/2/2020
    );

foreach my $date ( @dates ) {
    state $format = '%m/%d/%Y';
    $date =~ s/\b(\d)\b/0$1/g;  # add leading zeroes to lone digits
    my $t = eval { Time::Piece->strptime( $date, $format ) };
    unless( $t ) {
        say "Date <$date> is not valid";
        next;
        }
    unless( $t->strftime( $format ) eq $date ) {
        say "Round trip failed for <$date>: Got <"
            . $t->strftime( $format ) . ">";
        next;
        };
    say $t;
    }

Now the output is:

Wed Jan  6 00:00:00 2021
Round trip failed for <02/29/1900>: Got <03/01/1900>
Sun Feb  2 00:00:00 2020

That's all a bit long, but that's why we have subroutines:

if( date_is_valid( $date ) ) { ... }
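
One possible body for such a subroutine, assembled from the round-trip check above. The name `date_is_valid` comes from the call above; the up-front shape check and the return values of 1 and 0 are assumptions made for this sketch.

```perl
use strict;
use warnings;

use Time::Piece;    # comes with Perl

# Sketch: 1 if the mm/dd/yyyy text survives a parse-and-format round trip.
sub date_is_valid {
    my ($date) = @_;
    my $format = '%m/%d/%Y';

    # Assumed shape check so that only digits and slashes reach strptime.
    return 0 unless defined $date && $date =~ m{\A[0-9]{1,2}/[0-9]{1,2}/[0-9]{4}\z};

    # Add leading zeroes to lone digits so the round trip can compare text.
    ( my $padded = $date ) =~ s/\b(\d)\b/0$1/g;

    # strptime dies on input it cannot parse, hence the eval.
    my $t = eval { Time::Piece->strptime( $padded, $format ) };
    return 0 unless $t;

    # A date that was silently shifted (2/29/1900 to 3/1/1900) fails here.
    return $t->strftime($format) eq $padded ? 1 : 0;
}

for my $date (qw(01/06/2021 2/29/1900 2/2/2020)) {
    printf "%s %s\n", $date, date_is_valid($date) ? 'is valid' : 'is not valid';
}
```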

Still want a regex? Okay, let's use the (??{...}) construct to decide if a pattern should fail. Match a bunch of digits and capture that into $1. Now, use (??{...}) to make the next part of the pattern, using any Perl code you like. If you accept the capture, return a null pattern. If you reject it, return the pattern (*FAIL), which immediately causes the whole match to fail. No more tricky alternations. And this one uses the new chained comparison in v5.32 (although I still have misgivings about it):

use v5.32;

foreach ( qw(-1 0 1 37 31 5 ) ) {
    if( /\A(\d+)(??{ (1 <= $1 <= 31) ? '' : '(*FAIL)' })\z/ ) {
        say "Value <$1> is between 1 and 31";
        }
    }
brian d foy
  • See also: http://blogs.perl.org/users/e_choroba/2019/12/perl-weekly-challenge-038-date-finder-and-word-game.html – brian d foy Jan 29 '21 at 18:20
1

Regex for day 01-31:

(0[1-9]|[12]\d|3[01]) with prefix 0 - for "01", "04"

([1-9]|[12]\d|3[01]) without prefix 0 - for "1", "23"...

(0?[1-9]|[12]\d|3[01]) - with or without "0" - for "1" or "01"

RomanV
1

Simpler regex:

([12]\d|3[01]|0?[1-9])

Consider the accepted answer and this expression:

(0[1-9]|[12]\d|3[01])

This matches 01 but not 1

The other expression in the accepted answer:

([1-9]|[12]\d|3[01])

This matches 1 but not 01

It is not possible to add an OR clause to get them both working.

The one I suggested matches both. Hope this helps.

Prashant Saraswat
  • How is this better than the accepted answer (that you've copied/pasted)? – Toto Sep 06 '19 at 19:09
  • 1
    The accepted answer uses two different expressions to deal with days beginning with 0 or not. This one takes care of both in one expression. The behavior is similar to Kent Pawar's answer but the expression is much simpler. Also I couldn't plugin Kent's expression into another expression where I was trying to create regex for say YYMMDD etc – Prashant Saraswat Sep 06 '19 at 19:22
  • Can you please put the explanation as to why this is a better in the body of your answer? – mjuarez Sep 06 '19 at 21:49
  • OP says `MM` and "_this_ _matches_ _1_ _but_ _not_ _01_". Do note that to fit `MM` it should not match `1` and should match `01` instead – Cleptus Sep 26 '19 at 10:22
  • This is exactly what I was looking for and in a very concise form. Accepts 01 and 1. Very good, ignore the haters. :-) Accepted answer doesn't do both. – blissweb Aug 22 '21 at 03:08
1

Try it:
/(0?[1-9]|1[012])\/(0?[1-9]|[12][0-9]|3[01])\/((19|20)\d\d)/
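
As written, the pattern has no anchors, so it can match inside a longer string. A small Perl sketch of the difference, assuming the whole input should be the date; the variable names and the `\A` and `\z` anchors are additions for illustration.

```perl
use strict;
use warnings;

# The pattern from this answer, then the same pattern wrapped in anchors.
my $loose  = qr/(0?[1-9]|1[012])\/(0?[1-9]|[12][0-9]|3[01])\/((19|20)\d\d)/;
my $strict = qr/\A$loose\z/;

for my $text (qw(05/20/2011 13/20/2011 05/20/20111)) {
    printf "%-12s loose=%s strict=%s\n", $text,
        ( $text =~ $loose  ? 'match' : 'no match' ),
        ( $text =~ $strict ? 'match' : 'no match' );
}
```

Without anchors, 13/20/2011 still matches because the engine finds 3/20/2011 inside it; the anchored version rejects it.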

fvox
1

Is a regular expression a must? If not, you are better off using a different approach, such as DateTime::Format::DateManip:

use DateTime::Format::DateManip;

my @dates = (
    '04/23/2009',
    '01/22/2010 6pm',
    'adasdas',
    '1010101/12312312/1232132'
);

for my $date ( @dates )
{
    # keep the parsed object in its own variable so $date still holds the input text
    my $dt = DateTime::Format::DateManip->parse_datetime( $date );
    die "Bad Date $date" unless defined $dt;
    print "$dt\n";
}
snoofkin
-1

I have been working with this for some time and the best regex I've come up with is the following:

\b(0)?(?(1)[1-9]|(3)?(?(2)[01]|[12][0-9]))\b|\b[1-9]\b

It will match the following numbers:

1 01 10 15 20 25 30 31

It does not match the following:

32 40 50 90
  • 1
    The OP problem is totally different to what this answer tries to answer – Cleptus Sep 26 '19 at 08:39
  • @bradbury9 the question wants a regex to match a number in the range 1-31, that's exactly what this answer claims to do, isn't it? – moopet Sep 26 '19 at 09:34
  • I think that this pattern is quite accurate on plausible different situations. You may find days written as 01 or 1 in different contexts. Let's say when writing dates (E.g: 01/02/2019) or in more formal documents... etc. – Manuel Martin Sep 27 '19 at 09:36