sed appears to find and replace from right to left.

For example:

echo "a_b_c_d" | sed 's/.*\(_.*\)/\1/'

outputs

_d

But why doesn't

echo "a_b_c_d" | sed 's/^.*\(_.*\)/\1/'

or

echo "a_b_c_d" | sed 's/.*\(_.*$\)/\1/'

output

_b_c_d

Since neither of these outputs _b_c_d, how should this be done?

How can sed be made to match from the first occurrence of a character, rather than the last, when performing a find and replace?

Inian
Stuber
  • `.*` is greedy, so it makes the farthest match possible – Inian Dec 16 '20 at 18:41
  • `.*` is greedy; try something like `echo "a_b_c_d" | sed 's/[^_]*\(_.*\)/\1/'` instead to get `_b_c_d`. This matches everything before the first `_`, and saves the first `_` and everything after it in a back-reference that is then used as the replacement. – RavinderSingh13 Dec 16 '20 at 18:42
  • Does this answer your question? [Non greedy (reluctant) regex matching in sed?](https://stackoverflow.com/questions/1103149/non-greedy-reluctant-regex-matching-in-sed) – Mark Reed Dec 16 '20 at 18:42

1 Answer

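`.*` is a greedy pattern: it matches the longest possible substring that still allows the rest of the pattern (`_` in this case) to match. Placing `.*` before `_` therefore makes it consume everything up to the last `_` in your input. Adding `^` or `$` does not change this, because the leading `.*` already starts at the beginning of the line and the trailing `.*` already runs to the end.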
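
To see where the greedy `.*` actually stops, it can help to capture both halves of the match and print them in brackets. The following is a minimal sketch, not part of the original question; the variable names (`input`, `greedy`, `anchored`, `split`) are only illustrative.

```shell
#!/bin/sh
input="a_b_c_d"

# Greedy: the leading .* takes as much as it can, so the group starts at the LAST "_".
greedy=$(printf '%s\n' "$input" | sed 's/.*\(_.*\)/\1/')

# Anchoring with ^ changes nothing: the match already began at the start of the line.
anchored=$(printf '%s\n' "$input" | sed 's/^.*\(_.*\)/\1/')

# Capture both parts to show how the line is divided between them.
split=$(printf '%s\n' "$input" | sed 's/\(.*\)\(_.*\)/[\1][\2]/')

printf '%s\n' "$greedy" "$anchored" "$split"
# prints:
# _d
# _d
# [a_b_c][_d]
```

The bracketed output shows that the first `.*` has taken `a_b_c`, leaving only `_d` for the captured group.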

Since neither of these outputs _b_c_d, how should this be done?

echo "a_b_c_d" | sed 's/^[^_]*\(_.*$\)/\1/'

_b_c_d

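Here `[^_]*` is a negated bracket expression (called a negated character class in modern regex flavors) that matches zero or more characters that are not `_`. Because it cannot cross an underscore, it stops at the first `_` instead of the last.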
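
The same bracket trick can be used to check how the negated bracket expression divides the line, and what happens when the input contains no `_` at all. This is a rough sketch with illustrative variable names; the shorter `s/^[^_]*//` form is an alternative I am suggesting, not something from the answer above.

```shell
#!/bin/sh
input="a_b_c_d"

# [^_]* cannot cross an underscore, so the group starts at the FIRST "_".
first=$(printf '%s\n' "$input" | sed 's/^[^_]*\(_.*\)$/\1/')

# Capture both parts to show how the line is divided between them.
split=$(printf '%s\n' "$input" | sed 's/^\([^_]*\)\(_.*\)$/[\1][\2]/')

# Alternative: delete the leading non-underscore characters directly.
shorter=$(printf '%s\n' "$input" | sed 's/^[^_]*//')

# With no "_" in the line, the capturing form does not match and leaves the line unchanged.
unchanged=$(printf '%s\n' "abcd" | sed 's/^[^_]*\(_.*\)$/\1/')

printf '%s\n' "$first" "$split" "$shorter" "$unchanged"
# prints:
# _b_c_d
# [a][_b_c_d]
# _b_c_d
# abcd
```

Note that the two forms differ on lines without an underscore: the capturing form leaves such a line untouched, while `s/^[^_]*//` would delete the whole line's content.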

anubhava