
I'm looking for a bash one-liner that works on both Linux and OS X and removes the second line containing a given string:

Header
1
2
...
Header
10
11
...

Should become

Header
1
2
...
10
11
...

My first attempt was using the deletion option of sed:

sed -i '/^Header.*/d' file.txt

But that removes the first occurrence as well.

The question "How to delete the matching pattern from given occurrence" suggests using something like this:

sed -i '/^Header.*/{2,$d}' file.txt

But on OS X that gives the error

sed: 1: "/^Header.*/{2,$d}": extra characters at the end of d command
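
The message points at the `}` that directly follows `d`: BSD sed, as shipped with OS X, is stricter than GNU sed about what may follow a command inside `{...}`. A minimal sketch, assuming the header sits on line 1 and using a throwaway `demo.txt` (a hypothetical name), passes each piece of the script as its own `-e` fragment so the closing brace ends up on a separate script line. It prints to stdout and leaves the `-i` question aside:

```shell
# Build a small sample file (hypothetical name) in a scratch directory.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/sed-brace.XXXXXX")
printf '%s\n' Header 1 2 Header 10 11 > "$workdir/demo.txt"

# Each -e fragment becomes its own script line, so the closing brace
# is never glued to the d command.
brace_out=$(sed -e '/^Header/{' -e '2,$d' -e '}' "$workdir/demo.txt")
printf '%s\n' "$brace_out"

rm -r "$workdir"
```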

Next, I tried substitution, where I know how to use 2,$, followed by deletion of the resulting empty lines:

sed -i '2,$s/^Header.*//' file.txt
sed -i '/^\s*$/d' file.txt

This works on Linux, but on OS X, as mentioned in "sed command with -i option failing on Mac, but works on Linux", you have to use

sed -i '' '2,$s/^Header.*//' file.txt
sed -i '' '/^\s*$/d' file.txt

And that, in turn, doesn't work on Linux.
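
One commonly used workaround, sketched here on a throwaway file, is to attach a backup suffix directly to `-i`: both GNU and BSD sed accept `-i.bak`, and the backup can be removed afterwards. The sketch also swaps `\s` for the POSIX class `[[:space:]]`, since `\s` is a GNU extension. The file name and scratch directory are assumptions for illustration only.

```shell
workdir=$(mktemp -d "${TMPDIR:-/tmp}/sed-inplace.XXXXXX")
file="$workdir/file.txt"
printf '%s\n' Header 1 2 Header 10 11 > "$file"

# -i.bak (suffix attached, no space) is understood by both GNU and BSD sed.
sed -i.bak '2,$s/^Header.*//' "$file"
# Caveat: this also deletes blank lines that were already in the file.
sed -i.bak '/^[[:space:]]*$/d' "$file"
rm -f "$file.bak"

inplace_out=$(cat "$file")
printf '%s\n' "$inplace_out"
rm -r "$workdir"
```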

My question, then: isn't there a simple way to make this work in any Bash? It doesn't have to be sed, but it should be as shell-independent as possible, and I need to modify the file itself.

Anonymous

3 Answers


Since this is file-dependent and not line-dependent, awk can be a better tool.

Just keep a counter on how many times this happened:

awk -v patt="Header" '$0 == patt && ++f==2 {next} 1' file

This skips a line that exactly matches the given pattern when it occurs for the second time. All other lines are printed normally.
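
To make the counter's behaviour concrete, here is a small sketch on made-up input with three `Header` lines. The first command is the one above; the second is a variant (an assumption about what may be wanted, not part of the original answer) that drops every match after the first.

```shell
sample=$(printf '%s\n' Header 1 Header 2 Header 3)

# Original form: only the second exact match is skipped; the third stays.
second_only=$(printf '%s\n' "$sample" |
  awk -v patt="Header" '$0 == patt && ++f==2 {next} 1')

# Variant: f++ is 0 (false) for the first match and non-zero afterwards,
# so every later match is skipped.
all_later=$(printf '%s\n' "$sample" |
  awk -v patt="Header" '$0 == patt && f++ {next} 1')

printf '%s\n---\n%s\n' "$second_only" "$all_later"
```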

fedorqui
  • It doesn't work when using patt="Header.*", and it also writes to stdout, while I need to modify the file itself. How would it have to be modified to achieve this? – Anonymous Aug 04 '15 at 17:09
  • Hardcode then `/Header.*/`. To do in place editing, `awk ... file > tmpfile && mv tmpfile file` – fedorqui Aug 04 '15 at 19:29
  • @Max Adding `.*` to a regexp comparison is useless as it matches zero or more occurrences of any character so it will match exactly the same lines as just `Header` alone. In this case it MAY be overkill but if you like it you could keep the variable and just change `==` to `~` to do a regexp instead of string comparison to find `Header` in a line as opposed to when it's the whole line. – Ed Morton Aug 04 '15 at 21:10
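
Putting the two comments together, a minimal sketch of in-place editing with awk might look like this. It keeps the variable, switches `==` to `~` for a regexp match, and writes through a temporary file. The sample data, the scratch directory, and the `^Header` anchor are assumptions for illustration.

```shell
workdir=$(mktemp -d "${TMPDIR:-/tmp}/awk-inplace.XXXXXX")
file="$workdir/file.txt"
printf '%s\n' 'Header A' 1 2 'Header B' 10 11 > "$file"

# Temp file next to the target, so mv stays on one filesystem.
tmpfile=$(mktemp "$file.XXXXXX")
# ~ does a regexp match, so any line starting with Header is counted.
awk -v patt='^Header' '$0 ~ patt && ++f==2 {next} 1' "$file" > "$tmpfile" &&
  mv "$tmpfile" "$file"

awk_inplace_out=$(cat "$file")
printf '%s\n' "$awk_inplace_out"
rm -r "$workdir"
```

Note that `mv` replaces the file, so its permissions become those of the temporary file; `cat "$tmpfile" > "$file"` followed by `rm "$tmpfile"` would keep the original file in place instead.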

I would recommend using awk for this:

awk '!/^Header/ || !f++' file

This prints all lines that don't start with "Header". Short-circuit evaluation means that if the left-hand side of the || is true, the right-hand side isn't evaluated. If the line does start with Header, the second part, !f++, is true only once, for the first such line, so only the first header is printed.

$ cat file
baseball
Header and some other stuff
aardvark
Header for the second time and some other stuff
orange
$ awk '!/^Header/ || !f++' file
baseball
Header and some other stuff
aardvark
orange
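
For readers who find the one-liner dense, the same logic can be spelled out as two rules. This is only an equivalent rewrite of the command above on the same sample input; the variable name `seen` is arbitrary.

```shell
long_form_out=$(printf '%s\n' baseball 'Header and some other stuff' aardvark \
  'Header for the second time and some other stuff' orange |
awk '
  !/^Header/ { print; next }   # non-header lines always pass through
  !seen++                      # header lines: true (so printed) only the first time
')
printf '%s\n' "$long_form_out"
```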
Tom Fenech

This might work for you (GNU sed):

sed -i '1b;/^Header/d' file

Ignore the first line and then remove any occurrence of a line beginning with Header.
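
POSIX does not guarantee that `b` may be followed by a semicolon, so a more conservative spelling, sketched here as an assumption about what non-GNU seds will accept, splits the script into `-e` fragments. It writes to stdout, so it still has to be combined with a temporary file or a suitable `-i` form to modify the file itself.

```shell
# Same logic as above: branch past line 1, delete later Header lines.
sed_out=$(printf '%s\n' Header 1 2 Header 10 11 |
  sed -e '1b' -e '/^Header/d')
printf '%s\n' "$sed_out"
```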

To remove subsequent occurrences of the first line regardless of the string, use:

sed -ri '1h;1b;G;/^(.*)\n\1$/!P;d' file
potong