17

I want to match dates in the format mm/dd/yy or mm/dd/yyyy, but the match should not pick up 23/09/2010, where the month 23 is invalid, nor other invalid dates such as 00/12/2020 or 12/00/2011.

Phrogz
Lohith MV
  • That's not an easy task (although it is probably possible). You have to handle leap years within the regex to do that. – sawa May 12 '11 at 18:02
  • @sawa And the non-leap centuries, except the % 400 leap centuries. – Phrogz May 12 '11 at 18:56

7 Answers

43

Better than a crazy huge Regex (assuming this is for validation and not scanning):

require 'date'
def valid_date?( str, format="%m/%d/%Y" )
  Date.strptime(str,format) rescue false
end
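As a minimal sketch of how the same idea could cover both mm/dd/yy and mm/dd/yyyy: the helper below (`valid_us_date?` is a hypothetical name, not part of the answer) uses a small regex only to check the shape of the string and to pick the format, then lets `Date.strptime` decide whether the date exists on the calendar. It assumes that two-digit years should follow the `%y` convention of `strptime`.

```ruby
require 'date'

# Hypothetical helper: accepts mm/dd/yy or mm/dd/yyyy and returns true/false.
def valid_us_date?(str)
  # The regex only checks the shape of the input, not the calendar.
  return false unless str.match?(%r{\A\d{1,2}/\d{1,2}/(\d{2}|\d{4})\z})

  # Pick the strptime format from the length of the year part.
  format = str.split('/').last.length == 4 ? '%m/%d/%Y' : '%m/%d/%y'
  Date.strptime(str, format) # raises ArgumentError for impossible dates
  true
rescue ArgumentError
  false
end
```

With this sketch, `valid_us_date?('23/09/2010')` and `valid_us_date?('12/00/2011')` are rejected by the calendar check rather than by the regex.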

And as an editorial aside: Eww! Why would you use such a horribly broken date format? Go for ISO8601, YYYY-MM-DD, which is a valid international standard, has a consistent ordering of parts, and sorts lexicographically as well.
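To illustrate the point about lexicographic sorting, here is a small sketch with made-up sample dates (the constant names are only for illustration). Sorting the ISO 8601 strings as plain strings gives chronological order, while sorting the mm/dd/yyyy strings does not.

```ruby
require 'date'

# Made-up sample data in mm/dd/yyyy form.
US_DATES = ['12/31/2010', '01/15/2011', '06/01/2009'].freeze

# The same dates converted to ISO 8601 (YYYY-MM-DD) strings.
ISO_DATES = US_DATES.map { |s| Date.strptime(s, '%m/%d/%Y').iso8601 }.freeze

# Plain string sorting is chronological for the ISO strings...
ISO_SORTED = ISO_DATES.sort
# ...but only orders by month first for the mm/dd/yyyy strings.
US_SORTED = US_DATES.sort
```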

Phrogz
  • 1
    +1 for all recommendations. Regex is great for grabbing parts, but stinks for validating ranges. And, yes, the ISO8601 format can cure a lot of ills. – the Tin Man May 12 '11 at 18:02
  • 2
    I read this at an earlier date and didn't really understand why lexicographical ordering was a good thing. It allows you to sort dates or find the lowest/highest date using string comparison! E.g. `'2014-01-01' > '2013-12-12'` is `true`. This would fail using `mm/dd/yy` or anything similar. – Subtletree Jun 26 '14 at 00:02
  • Sorry, but as I told @Simon-Woker, the question asked for a regexp and not a generic solution to validate dates, so for me it is a -1. – Filippo1980 Aug 01 '18 at 10:22
  • @Filippo1980 Thanks for explaining. It's a reasonable reason to downvote. – Phrogz Aug 01 '18 at 13:52
26

You'd better do a split on / and test each individual part. But if you really want to use a regex, you can try this one:

#\A(?:(?:(?:(?:0?[13578])|(1[02]))/31/(19|20)?\d\d)|(?:(?:(?:0?[13-9])|(?:1[0-2]))/(?:29|30)/(?:19|20)?\d\d)|(?:0?2/29/(?:19|20)(?:(?:[02468][048])|(?:[13579][26])))|(?:(?:(?:0?[1-9])|(?:1[0-2]))/(?:(?:0?[1-9])|(?:1\d)|(?:2[0-8]))/(?:19|20)?\d\d))\Z#

Explanation:

\A           # start of string
 (?:         # group without capture
             # that matches the 31st of months 1, 3, 5, 7, 8, 10, 12
   (?:       # group without capture
     (?:     # group without capture
       (?:   # group without capture
         0?  # optional digit 0
         [13578] # one digit either 1,3,5,7 or 8
       )     # end group
       |     # alternative
       (1[02]) # 1 followed by 0 or 2
     )       # end group
     /       # slash
     31      # number 31
     /       # slash
     (19|20)? # optional century, 19 or 20
     \d\d    # 2 digits from 00 to 99 
   )         # end group
|
   (?:(?:(?:0?[13-9])|(?:1[0-2]))/(?:29|30)/(?:19|20)?\d\d)
|
   (?:0?2/29/(?:19|20)(?:(?:[02468][048])|(?:[13579][26])))
|
   (?:(?:(?:0?[1-9])|(?:1[0-2]))/(?:(?:0?[1-9])|(?:1\d)|(?:2[0-8]))/(?:19|20)?\d\d)
 )
\Z

I've explained the first part, leaving the rest as an exercise.

This matches one invalid date, 02/29/1900, but is correct for every other date between 01/01/1900 and 12/31/2099.
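The split approach recommended at the start of this answer could be sketched as follows. `valid_split_date?` is a hypothetical helper name, and mapping two-digit years to 20yy is an assumption made only for this example; the calendar check itself is delegated to Ruby's `Date.valid_date?`.

```ruby
require 'date'

# Hypothetical helper: split on "/" and test each part individually.
def valid_split_date?(str)
  parts = str.split('/', -1) # -1 keeps trailing empty parts
  return false unless parts.length == 3
  return false unless parts.all? { |part| part.match?(/\A\d+\z/) }

  month, day, year = parts
  # Shape check: mm, dd, then yy or yyyy.
  return false unless [1, 2].include?(month.length)
  return false unless [1, 2].include?(day.length)
  return false unless [2, 4].include?(year.length)

  full_year = year.to_i
  full_year += 2000 if year.length == 2 # assumption: yy means 20yy

  # Date.valid_date? knows month lengths and leap years.
  Date.valid_date?(full_year, month.to_i, day.to_i)
end
```

Unlike the regex, this sketch also rejects 02/29/1900, and an input without delimiters such as `1122012` simply fails the three-part check.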

Toto
  • 1
    +1 for recommending split. It's the simplest way with those specs. – kikito May 12 '11 at 14:26
  • @egarcia: Thanks. Sure it's certainly better and also more readable. – Toto May 12 '11 at 14:29
  • If you're going to anchor the regex for single-string validation then you should use `\A` and `\z` instead of `^` and `$`. – Phrogz May 12 '11 at 19:05
  • @Phrogz: It depends on regex flavor. But, you're right for Ruby. – Toto May 13 '11 at 07:51
  • what if the user doesn't enter slashes at all? the split method won't work. – E.E.33 Feb 04 '13 at 18:06
  • @E.E.33: if there are no delimiters at all, you can't validate anything: is `1122012` December 1st or February 11th? – Toto Feb 04 '13 at 18:41
  • @M42 but with a regex you could at least verify that the user is entering an improper format. Using .split, how do you handle the user entering an improper format, such as no delimiters? – E.E.33 Feb 04 '13 at 19:05
8

Or you simply use Date.parse "some random date".
You'll get an ArgumentError if parsing fails (=> the date is invalid).

See e.g. http://santoro.tk/mirror/ruby-core/classes/Date.html#M000644
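A minimal sketch of that approach, wrapping `Date.parse` so that it returns a boolean (`parseable_date?` is a hypothetical name). Note that `Date.parse` guesses the layout of the string instead of enforcing mm/dd/yyyy, so this only tells you that Ruby could read some date out of the input.

```ruby
require 'date'

# Hypothetical helper: true if Date.parse can read a date from the string.
def parseable_date?(str)
  Date.parse(str) # raises ArgumentError when no valid date is found
  true
rescue ArgumentError
  false
end
```

If the format has to be exactly mm/dd/yyyy, `Date.strptime` with an explicit format string is likely the stricter choice.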

Simon Woker
  • This is beautiful, thank you for saving me mucho time with a regex scalpel. – bobmagoo Dec 18 '12 at 23:43
  • 2
    Careful if you have unvalidated strings. Using a zip code for example has undesired results. `DateTime.parse('60201-4286').to_s` gives `"2060-07-19T00:00:00+00:00"` and does not fail. – dev_row Feb 27 '17 at 23:21
  • @Simon-Woker: I vote -1 on your answer because the question asked for a regular expression and NOT a solution to validate the date. So, sorry, but for me it could be a good answer – Filippo1980 Aug 01 '18 at 10:18
  • `DateTime.parse('monoxide')` ` => # ` – Jan Krupa Nov 25 '19 at 12:48
4

The best you can do with a regexp is to validate the format, e.g. something like:

[0-1][0-9]/[0-3][0-9]/[0-9]{2}(?:[0-9]{2})?

Anything beyond that cannot be reliably done without some kind of date dictionary. A date's validity depends on whether it's a leap year or not, for instance.
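A short sketch of the difference between checking the format and checking the date. The regex is the one above with `\A` and `\z` anchors added for whole-string validation (the anchors and the constant name are additions for this example); the calendar knowledge comes from Ruby's `Date` class.

```ruby
require 'date'

# Format-only check: the regex above, anchored to the whole string.
FORMAT_ONLY = %r{\A[0-1][0-9]/[0-3][0-9]/[0-9]{2}(?:[0-9]{2})?\z}

# The shape is fine, so the regex accepts month 19 and day 39.
SHAPE_OK = FORMAT_ONLY.match?('19/39/2010')

# The calendar check rejects the same values.
DATE_OK = Date.valid_date?(2010, 19, 39)

# Leap years need arithmetic a plain format regex does not express:
# 1900 is divisible by 100 but not by 400, so it is not a leap year.
LEAP_1900 = Date.leap?(1900)
LEAP_2000 = Date.leap?(2000)
```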

Denis de Bernardy
  • As shown by other answers, this is simply not true. Although the regex becomes ugly and unwieldy, you can match validity to some arbitrary level of correctness. – Phrogz May 12 '11 at 17:16
  • 2
    @Phrogz. It is practically true. The accepted answer is wrong as M42 notices; it does not handle leap years correctly. In order to do it, it has to incorporate the information about the switch to Gregorian and so on. The regex, then, will be a mess. – sawa May 12 '11 at 18:09
  • I'm obviously biased, but +1 sawa. ;-) – Denis de Bernardy May 12 '11 at 19:57
2

For MM-DD-YYYY (with `/`, `-` or `.` as the separator) you could use the regex below. It handles leap years and matches only valid dates, as long as the year is between 1900 and 2099.

(?:(09|04|06|11)(\/|-|\.)(0[1-9]|[12]\d|30)(\/|-|\.)((?:19|20)\d\d))|(?:(01|03|05|07|08|10|12)(\/|-|\.)(0[1-9]|[12]\d|3[01])(\/|-|\.)((?:19|20)\d\d))|(?:02(\/|-|\.)(?:(?:(0[1-9]|1\d|2[0-8])(\/|-|\.)((?:19|20)\d\d))|(?:(29)(\/|-|\.)((?:(?:19|20)(?:04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96))|2000))))

Check out the matches at http://regexr.com/
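The long alternation in the February 29 branch is simply every two-digit year divisible by 4, plus the special case 2000. As a rough sketch (the constant names are made up for this example), it can be generated and compared against Ruby's own leap-year rule for 1900 to 2099:

```ruby
require 'date'

# Two-digit years divisible by 4, zero-padded: "04", "08", ..., "96".
TWO_DIGIT_LEAP_YEARS = (4..96).step(4).map { |y| format('%02d', y) }

# The alternation as it appears inside the regex.
ALTERNATION = TWO_DIGIT_LEAP_YEARS.join('|')

# Every year the Feb 29 branch allows: 19yy and 20yy from the list, plus 2000.
REGEX_LEAP_YEARS = [19, 20].flat_map do |century|
  TWO_DIGIT_LEAP_YEARS.map { |yy| century * 100 + yy.to_i }
end + [2000]

# Leap years in the same range according to Date.leap?.
CALENDAR_LEAP_YEARS = (1900..2099).select { |y| Date.leap?(y) }
```

Both lists contain the same years, which is why the regex excludes 02/29/1900 but accepts 02/29/2000.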

piet.t
Jaison Joy
  • Do we have to trust you on this one or can you explain what this regex does? – luk2302 Jun 13 '17 at 09:59
  • You could go ahead and trust me. :) For the starters, it makes sure that you cannot add a '31' for the months of September, April, June and November. Feb 29 could be added only for leap years, and it works as long as the year is between 1900-2099. Please let me know if this helps. – Jaison Joy Jun 13 '17 at 14:27
0

So you want a regex that matches dates written as mm/dd/yyyy:

^(0?[1-9]|1[0-2])\/(0?[1-9]|[12][0-9]|3[01])\/[0-9]{4}$

This regex matches that format with a month from 1 to 12, a day from 1 to 31 and a four-digit year. It rejects values such as 23/09/2010, 00/12/2020 and 12/00/2011, but it does not check the day against the month, so 02/31/2020 still matches. You can test it on regex101 with dates such as 12/30/2040 and 09/09/2020. I think it is also about the shortest regex you can get for that format.

Mofor Emmanuel
0

Here's the code that you can use :) Try it and tell me:

^([0-2][0-9]|(3)[0-1])(\/)(((0)[0-9])|((1)[0-2]))(\/)\d{4}$
  • Did you try it? It matches `00/00/0000`, not sure it's a valid date! It also matches `31/02/2000` or `31/06/2018` but not `06/06/18`. Moreover OP wants date format as `mm/dd/yy` or `mm/dd/yyyy` – Toto Jun 16 '18 at 16:18