
I am using grep with -P (Perl-compatible regular expressions) to colorize strings:

echo aaa bbb ccc ddd eee | grep --color=always -P "^[^#]*\K(bbb|ccc)|^"

In the above example, I want to colorize the strings bbb and ccc. However, grep only colorizes the last one (ccc).

How can I modify my regular expression so that both strings match and are colorized?

brian d foy
Martin Vegter
  • Try `bbb|ccc`. What do `#` and the BOS anchor `^` have to do with it? –  Jul 28 '17 at 21:48
  • @sln - I don't want to colorize comments, thus the `#`. What do you mean "try bbb|ccc" ? – Martin Vegter Jul 28 '17 at 21:52
  • I'd imagine that `grep` colorizes only what it actually matches, not your regex used to match. (It is `ccc` here because `*` is greedy.) – zdim Jul 28 '17 at 21:54
  • Unless grep is not a line parser, it won't give you another match where the last one left off, right? –  Jul 28 '17 at 21:57
  • @sln Interestingly, adding `g` at the end makes it color `bbb`. I'd say it's because it _first_ matches `ccc` (`*` being greedy) and then it goes on to match `bbb`. – zdim Jul 28 '17 at 21:59

2 Answers


Your regex produces only one match per line: because [^#]* is greedy, it runs from the ^ start up to the last alternative it can find, which is ccc. But you want multiple matches. This can be achieved by chaining matches with the \G anchor, so that each match starts where the previous one ended.

Further, [^#]* needs to be made lazy by appending ?, so that it does not skip over an earlier match.

echo aaa bbb ccc ddd eee | grep --color=always -P "\G[^#]*?\K(?:bbb|ccc)"
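To see which substrings each pattern actually matches (and therefore what gets colored), one can swap --color for -o, which prints only the matched parts. This is a minimal sketch, assuming GNU grep built with PCRE support; the variable names are only for illustration.

```shell
line='aaa bbb ccc ddd eee'

# Greedy [^#]* anchored at ^: consumes the whole line, then backtracks to the LAST alternative.
greedy=$(printf '%s\n' "$line" | grep -oP '^[^#]*\K(?:bbb|ccc)')

# Lazy [^#]*? chained with \G: stops at the FIRST alternative, then resumes where that match ended.
chained=$(printf '%s\n' "$line" | grep -oP '\G[^#]*?\K(?:bbb|ccc)')

echo "greedy: " $greedy    # greedy: ccc
echo "chained:" $chained   # chained: bbb ccc
```

The unquoted expansions in the echo lines are deliberate: they join the one-match-per-line output of -o into a single line.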


And the regex variant for a multiline string:

(?:\G|\n)[^#]*?\K(?:bbb|ccc)

See this demo at regex101


A different approach is to use the PCRE verbs (*SKIP)(*F) to skip everything from # to the end of the line:

#.*(*SKIP)(*F)|bbb|ccc

See another demo at regex101
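A quick way to check that this pattern leaves comments alone is to run it with -o on a line that repeats a keyword after the #. This is a sketch assuming GNU grep with PCRE support; the sample line and variable names are made up for illustration.

```shell
# Hypothetical sample line: the second bbb sits inside a comment.
line='aaa bbb ccc # bbb'

# Plain alternation also matches inside the comment.
plain=$(printf '%s\n' "$line" | grep -oP 'bbb|ccc')

# At "#", "#.*" consumes the rest of the line; (*SKIP)(*F) then fails the match
# and prevents any retry inside the consumed text.
skipped=$(printf '%s\n' "$line" | grep -oP '#.*(*SKIP)(*F)|bbb|ccc')

echo "plain:  " $plain     # plain: bbb ccc bbb
echo "skipped:" $skipped   # skipped: bbb ccc
```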

bobble bubble

Another alternative is to use a perl command to do the matching and colorizing for you:

echo "aaa bbb ccc ddd eee fff" | perl -ne'print if s/(bbb|eee)/\e[1;35m$1\e[0m/g'
Boncrete