2

I have to format a date so that it matches a specific pattern, ^\\d{4}-\\d{2}-\\d{2}$. I have been searching for a while and can't find a solution. I found \\d{3}-\\d{4} and \d, which nearly match my requirements, but I can't figure out why my pattern doesn't match. I'm testing with regex101.

As far as I understand it, the first entry should match.

2021-11-10
"1234-12-34"

I also tried copying the characters from an ASCII table to make sure none of them are special characters.

heaxyh
  • In regex101 you should unescape the slashes (replace `\\d` with `\d`), otherwise the pattern matches '\' and 'd' characters instead of numbers. – Cray Nov 02 '21 at 07:19
  • Try to escape the minus sign, it is considered a range operator. That is, your regex should look like this `\d{4}\-\d{2}\-\d{2}`. This regex on the other hand matches any lower letter from a to z `[a-z]`. –  Nov 02 '21 at 07:19

2 Answers

1

Try removing the ^ and the $:

\\d{4}-\\d{2}-\\d{2}

^ matches the start of the line and $ matches the end of the line. The second entry has a " right after the start and right before the end, so the anchored pattern doesn't match it.
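As a rough illustration of the anchor behaviour, here is a minimal Python sketch. Python's `re` module is only an assumption, since the question doesn't name an engine, and the names `ANCHORED` and `UNANCHORED` are made up for this example. It runs both forms of the pattern against the two test entries from the question:

```python
import re

# Single backslashes in a raw string, so the engine receives \d (a digit).
ANCHORED = re.compile(r"^\d{4}-\d{2}-\d{2}$")
UNANCHORED = re.compile(r"\d{4}-\d{2}-\d{2}")

# The two test entries from the question; the second includes the quotes.
samples = ["2021-11-10", '"1234-12-34"']

for text in samples:
    anchored_hit = ANCHORED.search(text) is not None
    unanchored_hit = UNANCHORED.search(text) is not None
    print(f"{text!r}: anchored={anchored_hit}, unanchored={unanchored_hit}")
```

With the anchors in place, only the entry with nothing around the date matches. Without them, the date is also found inside the quoted entry.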

Depending on your language, there may be no need to double the backslashes (\\):

\d{4}-\d{2}-\d{2}

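Doubling the backslash is only for escaping: it is needed when the pattern sits inside a string literal in which the backslash is itself an escape character.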
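To make the difference concrete, here is a small sketch, again assuming Python's `re` module purely for illustration (the variable names are hypothetical). It compares what the engine sees when the doubled form is passed through verbatim, as happens when it is pasted into a regex tester, with what it sees when the doubling is consumed by a string literal:

```python
import re

# Doubled form passed verbatim to the engine (raw string, nothing unescaped):
# \\ means a literal backslash, so "d{4}" means four letter d's.
doubled = re.compile(r"^\\d{4}-\\d{2}-\\d{2}$")

# Single-backslash form: \d is the digit class.
single = re.compile(r"^\d{4}-\d{2}-\d{2}$")

# Doubled form inside an ordinary (non-raw) string literal: the language
# turns each \\ into one backslash before the regex engine ever sees it.
from_literal = re.compile("^\\d{4}-\\d{2}-\\d{2}$")

print(doubled.search("2021-11-10"))           # no match: there are no backslashes in the text
print(doubled.search("\\dddd-\\dd-\\dd"))     # matches the text \dddd-\dd-\dd
print(single.search("2021-11-10"))            # matches the date
print(from_literal.pattern == single.pattern)
```

So the doubled pattern is correct inside a string literal of a language that uses backslash escapes, but not when typed directly into a tool that takes the pattern as-is.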

hata
  • There is a line break at the end of the line, but I tried it without one and it still doesn't work. – heaxyh Nov 02 '21 at 07:11
  • The second one is supposed to be another test example. Sorry for the confusion, I edited it. I tried removing **^** and **$**, with no effect. – heaxyh Nov 02 '21 at 07:18
  • @heaxyh Depending on your language, you can do with single `\\`s. – hata Nov 02 '21 at 07:19
0

Using POSIX Character Classes

You don't mention a language, so I'm not sure why you're escaping your backslash characters. If you're looking for a portable solution, you can do this with POSIX character classes rather than the \d atom which (while fairly common these days) is certainly not universal.

For example, this anchored expression:

    ^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}$

will match with any extended regex engine I know, including egrep, pcregrep, Ruby, GNU awk, GNU sed (with the -r flag), and others. As an example:

    $ echo "2021-11-10" | egrep '^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}$'
    2021-11-10

Caveats

  1. It won't work with engines that don't understand the {} repetition syntax (e.g. BSD grep without the -E flag, which expects \{ \} in basic regular expressions).

  2. It will validate the format, but won't actually ensure that it's a valid date. For that, you need a tool that understands dates, such as GNU date. For example:

     $ date -d '2021-11-10'
     Wed Nov 10 00:00:00 EST 2021
    
     $ date -d '1234-12-34'
     date: invalid date ‘1234-12-34’
    
  3. Regular expressions are powerful, but they aren't always the right solution to every problem, especially if the problem is one of data validation.
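If GNU date isn't available, the second caveat can be approximated in a general-purpose language. The following is a minimal sketch in Python. The language choice and the helper name `is_valid_date` are assumptions for illustration, not something from the question. It checks the format first and then lets the standard library decide whether the date exists:

```python
import re
from datetime import datetime

# Strict shape check: exactly four, two and two ASCII digits.
# [0-9] is used instead of \d so that only ASCII digits are accepted.
DATE_FORMAT = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_valid_date(text: str) -> bool:
    """Return True if text is YYYY-MM-DD and names a real calendar date."""
    # strptime alone would also accept unpadded input such as "2021-1-5",
    # so the format is checked separately first.
    if DATE_FORMAT.fullmatch(text) is None:
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True


for candidate in ["2021-11-10", "1234-12-34", "2021-02-30"]:
    print(candidate, is_valid_date(candidate))
```

The regex only guarantees the shape. The calendar check is what rejects entries such as 1234-12-34, which pass the pattern but are not dates.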

Todd A. Jacobs
  • thanks for your deep explanation, but I can't tell which regex interpreter they use. I got it from an API, where they keep their language secret. – heaxyh Nov 02 '21 at 07:46