If you want to check for valid dates, you have to do much more than check numbers and ranges. Fortunately, Perl already has everything you need for this. The Time::Piece module comes with Perl and can parse a date. It knows how to parse dates and do the first round of checks:
use v5.10;
use Time::Piece; # comes with Perl
my @dates = qw(
01/06/2021 01/37/456 10/6/1582 10/18/1988
2/29/1900 2/29/1996 2/29/2000
);
foreach my $date ( @dates ) {
my $t = eval { Time::Piece->strptime( $date, '%m/%d/%Y' ) };
unless( $t ) {
say "Date <$date> is not valid";
next;
}
say $t;
}
The output is interesting and no other solution here is close to handling this. Why is 10/6/1582 an invalid date? It doesn't exist in the Gregorian calendar, but there's a simpler reason here. strptime
doesn't handle dates before 1900.
But also notice that 2/29/1900
gets turned into 3/1/1900
. That's weird and we should fix that, but there's no leap years in years divisible by 100. Well, unless they are divisible by 400, which is why 2/29/2000
works.
Wed Jan 6 00:00:00 2021
Date <01/37/456> is not valid
Date <10/6/1582> is not valid
Tue Oct 18 00:00:00 1988
Thu Mar 1 00:00:00 1900
Thu Feb 29 00:00:00 1996
Tue Feb 29 00:00:00 2000
But let's fix that leap year issue. The tm
struct is going a dumb conversion. If the individual numbers are within a reasonable range (0 to 31 for days) regardless of the month, then it converts those days to seconds and adds them to the offset. That's why 2/29/1900 ends up a day later: that 29 gives the same number of seconds as 3/1/1900. If the date is valid, it should come back the same. And since I'm going to roundtrip this, I fix up the date for leading zeros before I do anything with it:
use v5.10;
use Time::Piece; # comes with Perl
my @dates = qw(
01/06/2021 2/29/1900 2/2/2020
);
foreach my $date ( @dates ) {
state $format = '%m/%d/%Y';
$date =~ s/\b(\d)\b/0$1/g; # add leading zeroes to lone digits
my $t = eval { Time::Piece->strptime( $date, $format ) };
unless( $t ) {
say "Date <$date> is not valid";
next;
}
unless( $t->strftime( $format ) eq $date ) {
say "Round trip failed for <$date>: Got <"
. $t->strftime( $format ) . ">";
next;
};
say $t;
}
Now the output is:
Wed Jan 6 00:00:00 2021
Round trip failed for <02/29/1900>: Got <03/01/1900>
Sun Feb 2 00:00:00 2020
That's all a bit long, but that's why we have subroutines:
if( date_is_valid( $date ) ) { ... }
Still want a regex? Okay, lets use the (??{...})
construct to decide if a pattern should fail. Match a bunch of digits and capture that into $1
. Now, use (??{...})
to make the next part of the pattern, using any Perl code you like. If you accept the capture, return a null pattern. If you reject it, return the pattern (*FAIL)
, which immediately causes the whole match to fail. No more tricky alternations. And this one uses the new chained comparison in v5.32 (although I still have misgivings about it):
use v5.32;
foreach ( qw(-1 0 1 37 31 5 ) ) {
if( /\A(\d+)(??{ (1 <= $1 <= 31) ? '' : '(*FAIL)' })\z/ ) {
say "Value <$1> is between 1 and 31";
}
}