grep(1) uses POSIX Basic Regular Expressions (BREs) by default, and POSIX Extended Regular Expressions (EREs) when used with the -E option.
In POSIX regular expressions, escaping a non-special character (e.g. \s) has undefined behaviour, and there is no syntax for non-greedy matching (e.g. +?). Furthermore, in BREs the + and | operators are not available, and parentheses must be escaped to perform grouping.
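A minimal sketch of those BRE/ERE differences, assuming a POSIX-style grep on the PATH. The input lines and variable names are invented for illustration.

```shell
# Hypothetical sample input, invented for this sketch.
input='ab
abab
ba
cd'

# BRE: escaped parentheses group, and the interval \{1,\} stands in for +.
bre_group=$(printf '%s\n' "$input" | grep -e '^\(ab\)\{1,\}$')

# ERE: the same pattern with plain parentheses and +.
ere_group=$(printf '%s\n' "$input" | grep -E -e '^(ab)+$')

# BRE has no |, so each alternative is passed as its own -e pattern.
bre_alt=$(printf '%s\n' "$input" | grep -e '^ba$' -e '^cd$')

# ERE: a single pattern using |.
ere_alt=$(printf '%s\n' "$input" | grep -E -e '^ba$|^cd$')

printf '%s\n' "$bre_group" "$bre_alt"
```

Each BRE form should select the same lines as its ERE counterpart.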
The POSIX character classes [[:space:]] and [[:alnum:]_] are portable alternatives to \s and \w respectively.
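As a rough illustration, the non-portable pattern ^\w+\s+\w+$ could be spelled with the POSIX classes as below. The sample lines and the helper name sample are made up for this sketch.

```shell
# Hypothetical sample lines; \t in the printf format is a tab.
sample() { printf 'foo bar\nfoo\tbar_1\nfoobar\n'; }

# Portable BRE spelling of \w+\s+\w+, anchored to the whole line.
# The first two lines contain whitespace between two words; the third does not.
portable_count=$(sample |
  grep -c -e '^[[:alnum:]_]\{1,\}[[:space:]]\{1,\}[[:alnum:]_]\{1,\}$')

printf '%s\n' "$portable_count"
```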
Non-greedy matching can be emulated by excluding the next matching character from the repetition, e.g. [^*]+?\w*: is equivalent to [^*[:alnum:]_:]+[[:alnum:]_]*:.
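Because grep only selects lines, the extent of a match is easier to see with sed(1), which also uses BREs. The sketch below is an assumption-laden illustration on an invented input line, not part of the original command: it contrasts a greedy .* with a repetition that excludes the delimiter.

```shell
# Hypothetical input line, invented for this sketch.
line='key:value:more'

# Greedy: .* runs as far as it can, so the match ends at the last colon.
greedy=$(printf '%s\n' "$line" | sed 's/^.*:/X/')

# Emulated non-greedy: [^:]* cannot cross a colon, so the match ends at the first one.
lazy=$(printf '%s\n' "$line" | sed 's/^[^:]*:/X/')

printf '%s\n' "$greedy" "$lazy"
```

The first substitution leaves Xmore, the second Xvalue:more.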
The given regular expression can be represented as multiple BREs:
grep -e '^[[:space:]]*\*[[:space:]]\{1,\}\[ \][^*[:alnum:]_:]\{1,\}[[:alnum:]_]*:[^*]\{1,\}[[:digit:]]$' \
-e '[^*]\{1,\}\.com\.au$' file1
or an ERE:
grep -E '^[[:space:]]*\*[[:space:]]*\[ \][^*[:alnum:]_:]+[[:alnum:]_]*:[^*]+[[:digit:]]$|[^*]+\.com\.au$' \
file1
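A quick sanity check of the ERE form, run against made-up input since the contents of file1 are not shown here. The sample lines and the names sample, ere and matched are assumptions for illustration only.

```shell
# Hypothetical stand-in for file1, invented for this sketch.
sample() {
  printf '%s\n' \
    '* [ ] task: due 5' \
    '  * [ ] note: call 42' \
    '* [ ] task: no digit' \
    'contact example.com.au'
}

# The ERE from the command above, unchanged.
ere='^[[:space:]]*\*[[:space:]]*\[ \][^*[:alnum:]_:]+[[:alnum:]_]*:[^*]+[[:digit:]]$|[^*]+\.com\.au$'

# Lines 1 and 2 satisfy the first alternative, line 4 the second;
# line 3 ends in neither a digit nor .com.au, so it is not selected.
matched=$(sample | grep -E -e "$ere")

printf '%s\n' "$matched"
```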
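Note that the GNU implementation of grep(1) accepts the short character classes (\s and \w) in BREs and EREs as non-portable extensions. Non-greedy repetition (+?) is only available through its -P (Perl-compatible regular expressions) option, which is likewise non-portable; in an ERE, +? is not treated as a non-greedy quantifier.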