
The general format is:

YYYY/MM/DD/INFO
  • Only the separators / are mandatory.
  • Each part is optional.
  • YYYY - exactly 4 digits.
  • MM - exactly 2 digits.
  • DD - exactly 2 digits.
  • INFO - any sequence of letters, spaces or hyphens.

So these are valid strings:

2020/06/25/XYZConf
2020///XYZConf
2020//25/XYZConf
2020/06//XYZConf
//25/XYZConf
///

I'm really struggling to come up with a regex that validates optional parts while maintaining the integrity of the string as a whole.

How would you write this regular expression?

PS: This needs to be a regular expression as it will be part of a third-party lexer that doesn't accept anything else.

customcommander

1 Answer

You could try something like:

^(?:\d{4})?\/(?:(?:\d\d)?\/){2}(?:[A-Za-z\s-]+)?$


I believe that you are looking for optional (non)capturing groups. The pattern above matches:

  • ^ - Start-of-string anchor.
  • (?: - Open 1st non-capturing group.
    • \d{4} - Match 4 digits.
    • )? - Close 1st non-capturing group and make it optional.
  • \/ - Match a forward slash.
  • (?: - Open 2nd non-capturing group.
    • (?: - Open 3rd non-capturing group.
      • \d\d - Match two digits.
      • )? - Close 3rd non-capturing group and make it optional.
    • \/ - Match a forward slash.
    • ){2} - Close 2nd non-capturing group and make it match twice.
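  • (?: - Open 4th non-capturing group.
    • [A-Za-z\s-]+ - Match one or more letters (upper- or lowercase), whitespace characters or hyphens, in any order as per your question. Note that \s also matches tabs and newlines; use a literal space instead if only spaces should be allowed.
    • )? - Close 4th non-capturing group and make it optional.
  • $ - End-of-string anchor.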
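
To check the breakdown above against the examples from the question, here is a minimal Python sketch. The names `DATE_INFO` and `is_valid` are made up for illustration, and the invalid inputs are my own test cases rather than anything from the question. Your lexer's regex flavour may differ slightly from Python's `re`, so treat this as a sanity check, not a guarantee.

```python
import re

# The pattern from the answer. Python does not require "/" to be escaped,
# and fullmatch() anchors at both ends, so ^ and $ are left out here.
DATE_INFO = re.compile(r"(?:\d{4})?/(?:(?:\d\d)?/){2}(?:[A-Za-z\s-]+)?")


def is_valid(text: str) -> bool:
    """Return True if the whole string has the YYYY/MM/DD/INFO shape."""
    return DATE_INFO.fullmatch(text) is not None


# Valid strings taken from the question.
valid = [
    "2020/06/25/XYZConf",
    "2020///XYZConf",
    "2020//25/XYZConf",
    "2020/06//XYZConf",
    "//25/XYZConf",
    "///",
]

# Hypothetical invalid strings, chosen to break one rule each.
invalid = [
    "2020/06/25",          # only two separators
    "20/06/25/XYZConf",    # year is not 4 digits
    "2020/6/25/XYZConf",   # month is not 2 digits
    "2020/06/25/XYZ2020",  # INFO contains digits
]

for s in valid + invalid:
    print(f"{s!r}: {is_valid(s)}")
```

Since the last group is optional and its contents repeat with `+`, `(?:[A-Za-z\s-]+)?` should behave the same as the shorter `[A-Za-z\s-]*`.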
JvdV