
I'd like to match either cde or de, preferring cde when it is present. How can this be done in sed? For example:

$ echo "abcdef" | sed -r 's/^.*(cde|de).*$/\1/' 
de
$ echo "abcdef" | sed -r 's/^.*(de|cde).*$/\1/' 
de
$ echo "abcdef" | sed -r 's/^.*(c?de).*$/\1/' 
de

None of these work: each prints de, but I want cde.

Tim Mak
  • Does this answer your question? [Non greedy (reluctant) regex matching in sed?](https://stackoverflow.com/questions/1103149/non-greedy-reluctant-regex-matching-in-sed) – Nick Apr 22 '20 at 06:45
  • See also: https://stackoverflow.com/questions/59137763/greedy-behaviour-of-grep – Sundeep Apr 22 '20 at 06:51
  • This might help: `sed -r 's/^.*[^c](c?de).*$/\1/'` – Cyrus Apr 22 '20 at 06:53
  • @Cyrus good idea, modified to `sed -E 's/^(.*[^c]|^)(c?de).*$/\2/'` in case there are no characters before the match – Sundeep Apr 22 '20 at 06:58
  • @Sundeep: Yeah, I hadn't considered that case at all. – Cyrus Apr 22 '20 at 07:00

2 Answers


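sed doesn't support non-greedy quantifiers, and matching is leftmost-longest: the leading .* grabs as much as it can, including the c, so the optional c? ends up matching nothing. A single pattern of the form .*(c?de) therefore can't prefer cde. You could use branch commands for this particular example.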

$ # tested with GNU sed, syntax might differ for other implementations
$ # t command here will start next cycle if first s command succeeds
$ # so, the second s command will execute only if first one fails
$ printf 'abdef\nabcdef\n' | sed 's/.*cde.*/cde/; t; s/.*de.*/de/'
de
cde
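
To make the greedy behaviour described above visible, here is a minimal sketch that brackets what each capture group receives, followed by the single-pattern variant suggested in the comments. The function names show_groups and last_cde_or_de are hypothetical, chosen only for illustration, and the commands assume a sed that accepts -E (GNU and BSD sed both do).

```shell
# Hypothetical helper: bracket what each capture group received.
show_groups() {
    printf '%s\n' "$1" | sed -E 's/^(.*)(c?de).*$/[\1][\2]/'
}

# Single-pattern variant from the comments: the character before the
# match must not be c (or the match starts at the beginning of the line).
last_cde_or_de() {
    printf '%s\n' "$1" | sed -E 's/^(.*[^c]|^)(c?de).*$/\2/'
}

show_groups abcdef      # prints [abc][de]
last_cde_or_de abcdef   # prints cde
last_cde_or_de abdef    # prints de
last_cde_or_de cdef     # prints cde
```

Note that the two approaches differ when a line holds several candidates: the single-pattern variant reports the last one (xcdeyde gives de), whereas the t-based command prefers cde wherever it occurs on the line.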


You could also use grep, if it supports the -o option (print only the matching portions):

$ printf 'abdef\nabcdef\n' | grep -oE 'c?de'
de
cde
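
One thing to keep in mind with grep -o is that it prints every match on a line, each on its own output line. If only the first match is wanted, a sketch like the following may help; first_cde_or_de is a hypothetical name, and piping to head -n 1 assumes a single input line.

```shell
# Hypothetical helper: print only the first c?de match in one input line.
first_cde_or_de() {
    printf '%s\n' "$1" | grep -oE 'c?de' | head -n 1
}

first_cde_or_de abcdef    # prints cde
first_cde_or_de xdeycde   # prints de
```

The second call shows the trade-off: grep reports the leftmost match, so a bare de that appears earlier on the line wins over a later cde.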
Sundeep

Perl supports non-greedy quantifiers:

echo "abcdef" | perl -p -e 's/^.*?(c?de).*$/$1/'

Output:

cde
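
With -p, lines that don't match are printed unchanged (the sed commands behave the same way). If only the matched text should be printed, one possible sketch uses -n with an explicit print; match_only is a hypothetical name. Because an unanchored search already finds the leftmost match, no lazy prefix is needed here.

```shell
# Hypothetical helper: print the leftmost c?de match, or nothing at all.
match_only() {
    printf '%s\n' "$1" | perl -ne 'print "$1\n" if /(c?de)/'
}

match_only abcdef   # prints cde
match_only abdef    # prints de
match_only xyz      # prints nothing
```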
Cyrus