
I am trying to use negative look-ahead as per this reply to match numbers not containing the digit 5:

echo "aaa 123467890 3456 bbb" | egrep '[(?!5)[:digit:]]+'

The color output shows that the second number is matched. How do I fix this?

Is there a way with egrep to exclude 5 from the class [:digit:]? (I searched, but could not find anything on this.)

AlwaysLearning
  • Is there a reason you are not using `[012346789]+`? – lolbas Sep 12 '18 at 16:04
  • Look-around doesn't work within a character class. Use `[0-46-9]+` – Toto Sep 12 '18 at 16:06
  • @lolbas This is just an example to learn a more general approach, which would work with a larger class (e.g. matching a pattern involving a lower-case letter which is not `m`). – AlwaysLearning Sep 12 '18 at 16:06
  • You don't need look-ahead for this. Just add anchors to the expression or `-w`, e.g.: `grep -wE '[0-46-9]+'` – Thor Sep 12 '18 at 16:28
  • maybe you are looking for `\b(?:(?!5)\d)+\b` – Onyambu Sep 12 '18 at 16:29
  • I wonder whether egrep [would even support lookarounds](https://stackoverflow.com/a/10645676/5527985). However, you can use e.g. [`grep -oP '\b[^\D5]+\b'`](https://regex101.com/r/4Ie736/2), or for your [other concern something like this](https://regex101.com/r/4Ie736/1/). – bobble bubble Sep 12 '18 at 16:34

1 Answer

There are two problems with your regex:

  1. egrep (as in POSIX extended regular expressions) does not support look-ahead or look-behind at all.
  2. Even if it did, `[(?!5)[:digit:]]` is a single character class equivalent to `[[:digit:]()!?]`. The characters `(`, `?`, `!` and `)` have no special meaning inside a character class, so the class still contains every digit, including `5`.
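
The second point can be checked directly. The following is a minimal sketch, assuming a POSIX-compatible shell and a `grep` that supports `-o` (as GNU grep does); the sample input string is made up for illustration, and `grep -E` is used as the equivalent of `egrep`:

```shell
# Feed the original bracket expression an input that contains the
# punctuation characters from the attempted look-ahead.
matched=$(printf '%s\n' 'a(?!)b 5' | grep -oE '[(?!5)[:digit:]]+')

# Prints "(?!)" and "5" on separate lines: the parentheses, "?" and "!"
# are ordinary members of the set, and 5 is still matched as a digit.
printf '%s\n' "$matched"
```

This is also why the second number in the question is highlighted: the pattern matches any run of digits.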

Unfortunately, `egrep` doesn't support negated named character classes either (as in `[[:^digit:]]`).

That leaves you only two options:

  • Manually compute the set difference and list it explicitly:

    egrep '[0-46-9]+'
    
  • Switch to PCRE and use a Perl-style regex, either

    grep -P '[^\D5]+'
    

    (a trick that uses double negation and set union to compute a set difference: the class matches any character that is neither a non-digit nor `5`) or

    grep -P '(?:(?!5)\d)+'
    

    (the look-ahead version, fixed: the look-ahead sits outside the character class and is checked before each digit).
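
To see what these patterns actually select, here is a rough sketch run against the input from the question. It assumes a `grep` that supports `-o` and `-w` (GNU grep does); the variable names are only illustrative. The PCRE variant is guarded because `-P` is not available in every `grep` build. Note that every pattern above matches runs of digits other than 5, so `3456` still yields the partial matches `34` and `6`; the `-w` option suggested by Thor in the comments restricts the result to whole words.

```shell
input='aaa 123467890 3456 bbb'

# Explicit set difference: every run of digits other than 5.
runs=$(printf '%s\n' "$input" | grep -oE '[0-46-9]+')
printf 'runs:\n%s\n' "$runs"          # 123467890, 34 and 6

# Same pattern with -w: only matches that form a whole word survive,
# so the pieces of 3456 are dropped.
words=$(printf '%s\n' "$input" | grep -owE '[0-46-9]+')
printf 'whole words:\n%s\n' "$words"  # 123467890

# PCRE look-ahead version, run only if this grep understands -P.
if printf '1\n' | grep -qP '\d' 2>/dev/null; then
    pcre=$(printf '%s\n' "$input" | grep -oP '(?:(?!5)\d)+')
    printf 'PCRE look-ahead:\n%s\n' "$pcre"
fi
```

Where `-P` is available, the look-ahead version produces the same three runs as the explicit class, so the choice between them is mainly about readability when the base class is large.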

melpomene