Usually, it makes sense to use a regex only to check that the string consists of the allowed blocks, and then do the rest of the "counting" work in code (you just check how many mm, dd, or yyyy / yy blocks there are).
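A minimal sketch of that idea in Python, assuming the same block and separator rules as the patterns further down (two-letter dd and mm, two- or four-letter yy, optional _ or - between blocks). The names SHAPE, BLOCK and is_date_format are hypothetical and used only for illustration:

```python
import re

# Shape check only: exactly three blocks, optional "_" or "-" between them,
# and no trailing separator.
SHAPE = re.compile(
    r"(?:(?:[dD]{2}|[mM]{2}|[yY]{2}(?:[yY]{2})?)(?:[_-](?!$))?){3}"
)

# Tokenizer used for the counting step.
BLOCK = re.compile(r"[dD]{2}|[mM]{2}|[yY]{2}(?:[yY]{2})?")


def is_date_format(s: str) -> bool:
    """Return True if s holds exactly one day, one month and one year block."""
    if not SHAPE.fullmatch(s):
        return False
    # Reduce each block to its kind ("d", "m" or "y") and compare the counts.
    kinds = sorted(block[0].lower() for block in BLOCK.findall(s))
    return kinds == ["d", "m", "y"]


for candidate in ("dd-mm-yyyy", "YYYY_MM_DD", "mmmmmm", "dd-mm"):
    print(candidate, is_date_format(candidate))
```

The regex stays short because it no longer has to enforce uniqueness; that part is an ordinary list comparison.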
If you have to use a regex, there are two approaches.
Solution #1: Enumerating all alternatives
This is the least convenient approach, and it does not scale: you simply list every possible ordering as an alternative inside a single group:
^(?:
[dD]{2}[_-]?[mM]{2}[_-]?[yY]{2}(?:[yY]{2})? |
[mM]{2}[_-]?[dD]{2}[_-]?[yY]{2}(?:[yY]{2})? |
[mM]{2}[_-]?[yY]{2}(?:[yY]{2})?[_-]?[dD]{2} |
[dD]{2}[_-]?[yY]{2}(?:[yY]{2})?[_-]?[mM]{2} |
[yY]{2}(?:[yY]{2})?[_-]?[dD]{2}[_-]?[mM]{2} |
[yY]{2}(?:[yY]{2})?[_-]?[mM]{2}[_-]?[dD]{2}
)$
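See the regex demo. Here, ^ asserts the position at the start of the string, (?:...|...) is a non-capturing group holding the alternatives, and $ asserts the end of the string. The pattern is laid out with line breaks and spaces for readability, so it needs the free-spacing (x) flag.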
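If the list of alternatives has to be maintained, one option is to generate it instead of typing it by hand. The sketch below builds the six orderings with itertools.permutations; BLOCKS, SEP and build_pattern are illustrative names, not part of the original pattern:

```python
import re
from itertools import permutations

# Sub-patterns for each block, copied from the hand-written regex above.
BLOCKS = {
    "d": r"[dD]{2}",
    "m": r"[mM]{2}",
    "y": r"[yY]{2}(?:[yY]{2})?",
}
SEP = r"[_-]?"  # optional separator between two blocks


def build_pattern() -> str:
    """Join every ordering of the three blocks into one alternation."""
    alternatives = [
        SEP.join(BLOCKS[kind] for kind in order)
        for order in permutations("dmy")
    ]
    return "^(?:" + "|".join(alternatives) + ")$"


DATE_FORMAT = re.compile(build_pattern())

for candidate in ("dd-mm-yyyy", "yymmdd", "MM_YYYY_dd", "ddmm"):
    print(candidate, bool(DATE_FORMAT.match(candidate)))
```

This produces the same kind of pattern as the enumeration, just without the manual copy-and-paste, so adding a fourth block would not require writing out 24 lines.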
Solution #2: Dynamic approach
This approach matches a string that consists only of three D, M, or Y blocks and restricts the pattern with positive lookaheads that require the string to contain exactly one occurrence of each block. The difficulty is that the blocks are multi-character strings, so you need a tempered greedy token (or its unrolled equivalent, which makes the regex even more monstrous):
^
(?=(?:(?![mM]{2}).)*[mM]{2}(?:(?![mM]{2}).)*$)
(?=(?:(?![dD]{2}).)*[dD]{2}(?:(?![dD]{2}).)*$)
(?=(?:(?![yY]{2}(?:[yY]{2})?).)*[yY]{2}(?:[yY]{2})?(?:(?![yY]{2}(?:[yY]{2})?).)*$)
(?:
(?:[mM]{2}|[dD]{2}|[yY]{2}(?:[yY]{2})?)
(?:[_-](?!$))?
){3}
$
See the regex demo. Here, the (?:[mM]{2}|[dD]{2}|[yY]{2}(?:[yY]{2})?)(?:[_-](?!$))? part repeats 3 times from start to end, so on its own it accepts any three d, y, or m blocks, even identical ones (without the lookaheads, mmmmmm would match, too). The lookaheads all have the form (?=(?:(?!BLOCK).)*BLOCK(?:(?!BLOCK).)*$): each one matches only if there is any text other than BLOCK, then one BLOCK, and then any text other than BLOCK up to the end of the string.
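To see how the two halves of the pattern cooperate, here is a small Python sketch that compiles the regex above with re.VERBOSE (Python's free-spacing mode) next to a copy without the lookaheads. The names DYNAMIC and BODY_ONLY are made up for this illustration:

```python
import re

# The full pattern: three lookaheads plus the consuming part.
DYNAMIC = re.compile(
    r"""
    ^
    (?=(?:(?![mM]{2}).)*[mM]{2}(?:(?![mM]{2}).)*$)
    (?=(?:(?![dD]{2}).)*[dD]{2}(?:(?![dD]{2}).)*$)
    (?=(?:(?![yY]{2}(?:[yY]{2})?).)*[yY]{2}(?:[yY]{2})?(?:(?![yY]{2}(?:[yY]{2})?).)*$)
    (?:
        (?:[mM]{2}|[dD]{2}|[yY]{2}(?:[yY]{2})?)
        (?:[_-](?!$))?
    ){3}
    $
    """,
    re.VERBOSE,
)

# The same consuming part with the lookaheads removed.
BODY_ONLY = re.compile(
    r"^(?:(?:[mM]{2}|[dD]{2}|[yY]{2}(?:[yY]{2})?)(?:[_-](?!$))?){3}$"
)

for candidate in ("dd-mm-yyyy", "yyyymmdd", "mmmmmm", "mmddmm"):
    print(
        candidate,
        "body only:", bool(BODY_ONLY.match(candidate)),
        "with lookaheads:", bool(DYNAMIC.match(candidate)),
    )
```

Strings such as mmmmmm and mmddmm pass the body-only check but are rejected once the lookaheads are in place, because each lookahead demands exactly one occurrence of its block.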