How to prevent sed from editing the line containing the first match of a pattern, when a variable is used?

Question

This sed code deletes all lines containing {$pattern} found within file.txt:

sed -i "/{$pattern}/d" ./file.txt

Currently, this deletes all of the lines in this sample file.txt when $pattern is set to "cat":

One day, the {cat} said to the {owl}, "What are you eating for breakfast, {owl}?"
The {owl} replied, "I am eating {cereal}. Would you like some, {cat}?"
"Yes, I would," replied the {cat}.
So the {cat} and the {owl} ate {cereal}.

I need to change this so that it keeps the first line containing {$pattern} and deletes only the later lines that also contain it:

E.g., suppose file.txt is this:

One day, the {cat} said to the {owl}, "What are you eating for breakfast, {owl}?"
The {owl} replied, "I am eating {cereal}. Would you like some, {cat}?"
"Yes, I would," replied the {cat}.
So the {cat} and the {owl} ate {cereal}.

If $pattern were set to "cat", the output would look like this, because line 1 has the first occurrence of "cat", and all other later lines also have it, so they are all deleted:

One day, the {cat} said to the {owl}, "What are you eating for breakfast, {owl}?"

If $pattern were set to "owl", the output would look like this, because line 1 contains the first occurrence of "owl", and lines 2 and 4 also have instances, which are deleted:

One day, the {cat} said to the {owl}, "What are you eating for breakfast, {owl}?"
"Yes, I would," replied the {cat}.

If $pattern were set to "cereal", the output would look like this with line 4 deleted, because line 4 is the only one with an additional occurrence of "cereal".

One day, the {cat} said to the {owl}, "What are you eating for breakfast, {owl}?"
The {owl} replied, "I am eating {cereal}. Would you like some, {cat}?"
"Yes, I would," replied the {cat}.

The file must not be sorted, as the order of lines is important.
Note that line 1 contains two copies of "owl", but the second appearance on that line is not regarded as the second appearance of "owl".

Is there any way to make sed keep the line containing the first appearance of {$pattern} and delete only the later lines that contain it?

Related? http://stackoverflow.com/questions/17910718/how-to-delete-the-matching-pattern-from-given-occurrence — fedorqui, Mar 07 '14 at 13:23
Yes, that nearly answers, but I find `sed -i "/$line/{2,$d}"` results in `sed: -e expression #1, char 33: unexpected ','`. `{2,$d}` seems incompatible with double-quotes on `sed`, but `$line` does not seem to work when single-quotes are used. — Village, Mar 07 '14 at 13:31
Do you want to delete each line of text that contains $line, or every occurrence of $line in the file (in both cases, from the 2nd occurrence on)? Your question describes the content, but the sample shows full lines, and the replies go that way. — NeronLeVelu, Mar 07 '14 at 14:00
If "$line" contained the text ".*", what would you want to happen? What if "$line" was set to "her" and you hit a line that contained "there" - is that a matching line or not? Do you want to delete the lines that contain the pattern, or only the lines that exactly match the pattern, or do you want to delete from the pattern to the end of the line, or something else? Please post some representative sample input and expected output showing these and other edge cases that you will care about. — Ed Morton, Mar 07 '14 at 19:05
If "$line" is set to "her", this would only match items marked as "{her}", not anything like "there", or "{there}", but "t{her}e" would be regarded as a match. Note that "{" and "}" are marks I've placed in the text previously to make it easier for this script to find only the correct items and to ignore other text. In other words, it does not need to know what a word boundary is. — Village, Mar 08 '14 at 01:54
It should never do anything to the line containing the first occurrence, even if the second occurrence of the matching pattern is in that same first line. It should only delete those later lines also having appearances. — Village, Mar 08 '14 at 01:56
".*" does not appear anywhere in my input file, and I don't know which choice is best or what kind of answer would benefit others. — Village, Mar 08 '14 at 01:57

potong · Answer 1 · 2014-03-08T09:15:20.523

3

This might work for you (GNU sed):

sed '/{'"$var"'}/{x;//{x;d};x;h}' file

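The hold space acts as a switch: it stays empty until the first line matching {$var} is copied into it, and from then on every further matching line is deleted.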
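The one-liner above can be traced command by command. The following is only a sketch under assumptions: it assumes GNU sed, and the function name `delete_later_matches` and its arguments are made up for illustration, not part of the answer.

```shell
# Hypothetical wrapper around the one-liner above.
# Usage: delete_later_matches WORD FILE
#
# What happens on each line that matches /{WORD}/:
#   x        swap pattern space and hold space (pattern space now holds
#            whatever was saved earlier; empty before the first match)
#   //{x;d}  the empty regex reuses /{WORD}/; if the saved text matches,
#            a matching line was already seen, so swap back and delete
#   x;h      otherwise swap back and copy the line into the hold space,
#            which flips the "already seen" switch
delete_later_matches() {
    var=$1
    sed '/{'"$var"'}/{x;//{x;d};x;h}' "$2"
}
```

For example, `delete_later_matches cereal ./file.txt` would be expected to print the sample file without its fourth line. Lines that do not match never touch the hold space, so they pass through unchanged.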

edited Mar 08 '14 at 09:15

answered Mar 07 '14 at 14:48

potong

davir · Answer 2 · 2014-03-08T11:40:59.043

2

I should have tested this more. potong pointed out correctly in the comments that

This will skip over every $line and delete everything else.

According to http://www.grymoire.com/Unix/Sed.html#uh-35a, you could do:

sed -i '
    /{$line}/,$ {
        /{$line}/n # skip over the line that has "{$line}" on it
        d
    }
' ./file.txt

edited Mar 08 '14 at 11:40

answered Mar 07 '14 at 13:33

davir

1

This will skip over every $line and delete everything else. – potong Mar 08 '14 at 09:29

Adrian Frühwirth · Answer 3 · 2014-03-07T13:39:08.493

2

You need to escape the $d, otherwise it gets treated as a shell variable and expanded by the shell (possibly to an empty string):

sed -i "/{$line}/{2,\$d}" ./file.txt
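To see what the shell actually hands to sed in each case, the two quoting styles can be printed side by side. This is an illustrative sketch only: it assumes `line` is set to `cat` and that no shell variable named `d` is set.

```shell
line=cat
unset d   # assumption: no variable named d exists in the calling shell

# Unescaped: the shell expands $d (to nothing here), so sed would see "2,}".
unescaped="/{$line}/{2,$d}"

# Escaped: \$ keeps a literal dollar sign, so sed sees the range "2,$".
escaped="/{$line}/{2,\$d}"

printf '%s\n' "$unescaped" "$escaped"
```

Note that `2,$` is a line-number range, so this form spares a matching line only when the first match happens to be on line 1 of the file.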

edited Mar 07 '14 at 13:39

answered Mar 07 '14 at 13:33

Adrian Frühwirth

I tried running this, but found it seems to be ignoring the `{` and `}` and just erasing any lines with `$line`. – Village Mar 07 '14 at 13:39

score 2 · Accepted Answer · answered Mar 07 '14 at 13:36

Alternative using awk:

$ awk "/$line/&&c++ {next} 1" ./file.txt

answered Mar 07 '14 at 13:36

Josh Jolly

2

You should use variable with `awk` like this: `awk -v var="$line" '$0~var&&c++ {next} 1' ./file.txt` – Jotne Mar 07 '14 at 13:54
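A minimal sketch of the `-v` approach, extended on the assumption that the braces should be part of the match and that the word should be treated as literal text rather than a regex. The helper name `keep_first_match` is hypothetical, and swapping the regex test for `index()` is a substitution made here, not something proposed in the thread.

```shell
# Hypothetical helper. Usage: keep_first_match WORD FILE
keep_first_match() {
    # pat is the literal marker text, e.g. "{cat}". index() does a plain
    # substring search, so regex metacharacters in WORD are not special.
    # Caveat: awk still interprets backslash escapes in -v values.
    # "seen++" is 0 (false) on the first matching line, so that line is
    # printed; on every later matching line it is non-zero and the line
    # is skipped. Non-matching lines never increment the counter.
    awk -v pat="{$1}" 'index($0, pat) && seen++ { next } 1' "$2"
}
```

Called as `keep_first_match owl ./file.txt`, this prints to standard output rather than editing the file in place, so the result would need to be redirected to a temporary file and moved back if in-place behaviour is wanted.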

score 2 · Answer 5 · answered Mar 07 '14 at 23:29

Using GNU sed:

sed -ne "/$line/{2,\$p}" -e "/$line/! {p}" file

answered Mar 07 '14 at 23:29

jaypal singh

I don't think this does what you think it does. To me it says `if a line contains $line and is between 2 and the end-of-file, print it. Otherwise if it does not contain $line print it.` The OP wants the first occurrence of $line printed and the rest suppressed. – potong Mar 08 '14 at 09:24
