
I wrote the following command

echo -en 'uno\ndue\n' | sed -E 's/^.*(uno|$)/\1/'

expecting the following output

uno

This is indeed the case with my GNU Sed 4.8.

However, I've verified that BSD Sed outputs



Why is that the case?

Wiktor Stribiżew
Enlico
  • I'm not sure I would have the same expectations. Regexes are greedy. Because of that, the `.*` should always match the entire line, so that inside the parens matches the end of line. – Tim Roberts Apr 11 '22 at 21:27
  • [This answer](https://stackoverflow.com/a/24276470/3266847) goes in-depth about the differences between various sed implementations. – Benjamin W. Apr 11 '22 at 21:34
  • Just a guess here: it looks like the GNU ERE regex engine is willing to backtrack farther to find the longer match ("uno"), whereas the BSD regex engine is happy enough to let `.*` consume the whole line, and then capture `($)` the empty string. – glenn jackman Apr 12 '22 at 03:07
  • @TimRoberts, I'm pretty sure _Mastering Regular Expressions_ gives examples of engines where alternation is neither greedy nor lazy, but ordered. – Enlico Apr 12 '22 at 06:47
  • `perl` gives empty lines too. I think this depends on implementation, and as linked above, there are plenty of differences between `GNU` and `BSD` – Sundeep Apr 12 '22 at 08:01
  • @TimRoberts quantifiers in BRE/ERE are not exactly greedy though, longest match wins. For example, `echo 'foo123312baz' | grep -oE 'o[123]+(12baz)?'` gives `o123312baz` whereas you'll get `o123312` with greedy quantifiers like those in PCRE – Sundeep Apr 12 '22 at 08:05

1 Answer


I'd say that BSD sed aims at POSIX compatibility only. POSIX specifies only basic regular expressions for sed, which have many limitations (e.g., no support for `|` (alternation) at all, no direct support for `+` and `?`) and different escaping requirements.

BSD sed is the default on macOS, so the very first thing I do on a new system is get a GNU-compatible sed: brew install gsed.
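
If installing another sed is not an option, the ambiguous alternation can be avoided altogether. The following is only a minimal sketch: it assumes the intent is "print uno for lines that end in uno, and an empty line otherwise" (which is what the GNU output in the question amounts to), and the function name `keep_trailing_uno` is made up for illustration. It uses only basic regular expressions and the `t` command, so it should not depend on how an engine resolves `(uno|$)`.

```shell
# Hypothetical helper (name not from the question): emulate the GNU result
# without alternation, using only BRE syntax.
keep_trailing_uno() {
  # 1st command: if the line ends in "uno", keep just that capture.
  # t:           if that substitution succeeded, skip to the end of the script.
  # 3rd command: otherwise blank the line.
  sed -e 's/^.*\(uno\)$/\1/' -e t -e 's/.*//'
}

# Same input as in the question; printf is used because echo -en is not portable.
printf 'uno\ndue\n' | keep_trailing_uno
```

With the question's input this should print uno followed by an empty line, because the first substitution can only succeed when `.*` leaves a trailing uno for the group.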

Jarek