73

I want to match an input string (contained in the variable $1) with a regex representing the date formats MM/DD/YYYY and MM-DD-YYYY.

REGEX_DATE="^\d{2}[\/\-]\d{2}[\/\-]\d{4}$"
 
echo "$1" | grep -q $REGEX_DATE
echo $?

The echo $? returns the error code 1 no matter the input string.

Mateen Ulhaq
Jérôme G
  • 1
    This is possibly a duplicate of: http://stackoverflow.com/questions/19737675/shell-script-how-to-extract-string-using-regular-expressions – XsiSecOfficial Mar 10 '16 at 14:25
  • That's because `$?` reports on the first command in the pipe chain, which is echo - the echo will obviously succeed, so you get a `1` exit code. try `grep $pattern <<< $1` instead. – Marc B Mar 10 '16 at 14:26
  • See [this question](http://stackoverflow.com/q/1221833/440558) for one solution. – Some programmer dude Mar 10 '16 at 14:28
  • 1
    Always check your program's documentation to see what style of regular expressions are accepted. – chepner Mar 10 '16 at 14:34
  • 4
    @MarcB err, no, it's the other way around -- `$?` is the **last** exit status in the pipeline – Mike Frysinger Mar 10 '16 at 14:39

3 Answers

110

To complement the existing helpful answers:

Using Bash's own regex-matching operator, =~, is a faster alternative in this case, given that you're only matching a single value already stored in a variable:

set -- '12-34-5678' # set $1 to sample value

kREGEX_DATE='^[0-9]{2}[-/][0-9]{2}[-/][0-9]{4}$' # note use of [0-9] to avoid \d
[[ $1 =~ $kREGEX_DATE ]]
echo $? # 0 with the sample value, i.e., a successful match

Note, however, that the caveat about flavor-specific regex constructs such as \d applies equally here: while =~ supports EREs (extended regular expressions), it also supports the host platform's specific extensions, which makes this a rare case of Bash's behavior being platform-dependent.

To remain portable (in the context of Bash), stick to the POSIX ERE specification.

Note that =~ even allows you to define capture groups (parenthesized subexpressions) whose matches you can later access through Bash's special ${BASH_REMATCH[@]} array variable.
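
As a minimal sketch of the capture-group point (Bash only), the same pattern can wrap each numeric field in parentheses and read the pieces back from BASH_REMATCH. The variable names date_input, re, month, day and year are illustrative assumptions, not part of the original question:

```shell
# Bash only: BASH_REMATCH is filled in by a successful [[ ... =~ ... ]] match.
date_input='12-34-5678'   # hypothetical sample value

# Same pattern as above, with each numeric field in its own capture group.
re='^([0-9]{2})[-/]([0-9]{2})[-/]([0-9]{4})$'

if [[ $date_input =~ $re ]]; then
  # Index 0 holds the whole match; 1..3 hold the parenthesized groups.
  month=${BASH_REMATCH[1]}
  day=${BASH_REMATCH[2]}
  year=${BASH_REMATCH[3]}
  echo "month=$month day=$day year=$year"
else
  echo "no match" >&2
fi
```

Note that the regex only checks the shape of the input, so a value such as 34 for the day still matches; range checks would have to be done separately on the captured fields.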

Further notes:

  • $kREGEX_DATE is used unquoted, which is necessary for the regex to be recognized as such (quoted parts would be treated as literals).

  • While not always necessary, it is advisable to store the regex in a variable first, because Bash has trouble with regex literals containing \.

    • E.g., on Linux, where \< is supported to match word boundaries, [[ 3 =~ \<3 ]] && echo yes doesn't work, but re='\<3'; [[ 3 =~ $re ]] && echo yes does.
  • I've renamed the variable REGEX_DATE to kREGEX_DATE (the k signaling a (conceptual) constant) so that the name isn't all-uppercase: all-uppercase variable names should be avoided to prevent conflicts with special environment and shell variables.
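
The quoting note above can be checked with a small sketch, assuming Bash 3.2 or later with default compatibility settings. The names re, value, unquoted_status and quoted_status are made up for the illustration:

```shell
# Bash only (3.2 or later assumed).
re='^[0-9]{2}$'   # hypothetical two-digit pattern
value='42'        # hypothetical sample value

# Unquoted: the contents of $re are interpreted as a regex.
[[ $value =~ $re ]];   unquoted_status=$?

# Quoted: the contents of $re are matched as a literal substring,
# i.e. the test asks whether "42" contains the text ^[0-9]{2}$.
[[ $value =~ "$re" ]]; quoted_status=$?

echo "unquoted=$unquoted_status quoted=$quoted_status"
```

Comparing the two exit statuses makes it easy to see whether an accidental pair of double quotes is the reason a pattern never matches.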

mklement0
  • 382,024
  • 64
  • 607
  • 775
31

I think this is what you want:

REGEX_DATE='^\d{2}[/-]\d{2}[/-]\d{4}$'

echo "$1" | grep -P -q $REGEX_DATE
echo $?

I've used the -P switch to get perl regex.

Chris Lear
  • 6,592
  • 1
  • 18
  • 26
  • 8
    just to clarify, `-P` is not guaranteed to be supported in all distros. so if portability is a concern, you'll want to avoid it. – Mike Frysinger Mar 10 '16 at 14:45
  • In which case, @MikeFrysinger 's solution is preferable. This one has the slight attraction of using the original regex, give or take some escaping. – Chris Lear Mar 10 '16 at 14:47
13

the problem is you're trying to use regex features not supported by grep. namely, your \d won't work. use this instead:

REGEX_DATE="^[[:digit:]]{2}[-/][[:digit:]]{2}[-/][[:digit:]]{4}$"
echo "$1" | grep -qE "${REGEX_DATE}"
echo $?

you need the -E flag to get ERE in order to use {#} style.
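
a minimal sketch of how the same check could be wrapped so the exit status of grep drives an if directly, instead of being printed with echo $?. the function name is_date and the sample values are assumptions made for illustration:

```shell
# is_date is a hypothetical helper name; it succeeds (exit status 0)
# when its argument looks like MM/DD/YYYY or MM-DD-YYYY.
is_date() {
  printf '%s\n' "$1" |
    grep -qE '^[[:digit:]]{2}[-/][[:digit:]]{2}[-/][[:digit:]]{4}$'
}

# Hypothetical sample inputs: two in the expected shape, one not.
for candidate in '12/31/2015' '12-31-2015' '2015-12-31'; do
  if is_date "$candidate"; then
    echo "$candidate: match"
  else
    echo "$candidate: no match"
  fi
done
```

printf is used here instead of echo only so that inputs starting with a dash or containing backslashes are passed through unchanged; the pattern itself is the one from the answer.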

Mike Frysinger
  • 2,827
  • 1
  • 21
  • 26
  • 1
    ++; note that if you `\ `-escaped the `{` and `}` instances, this particular regex would have worked without `-E`, as a BRE (basic regex) as well; as a non-portable aside: BSD/OSX `grep` - unlike GNU `grep` - actually does support `\d`. – mklement0 Mar 10 '16 at 14:52
  • 1
    you are certainly correct; however i prefer `-E` over `\` everywhere as it makes the code much more readable, and it's in [POSIX](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/grep.html). – Mike Frysinger Mar 10 '16 at 17:44