For the simple task of removing two lines if each matches some pattern, all you need to do is:
sed '/<!DOCTYPE.*/{N;/\n<h1.*/d}'
This uses an address matching the first line you want to delete. When the address matches, it executes:
N (next) - append the next line of input to the current pattern space (including the \n that separates the two lines)
Then it tests a second address against the contents of the second line (the text following \n). If that matches, it executes:
d (delete) - discard the current pattern space and start reading the next unread line
If d isn't executed, both lines are printed by default and execution continues as normal.
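As a quick sanity check of the behavior described above, here is a small sketch run against made-up sample input (the input lines are assumptions for illustration only). It differs from the one-liner above only by a ; before the closing brace, which some non-GNU seds require:

```shell
# Hypothetical sample input: one DOCTYPE line followed by an <h1> line,
# and a second DOCTYPE line followed by something that is not an <h1>.
input='<!DOCTYPE html>
<h1>Title</h1>
<p>keep me</p>
<!DOCTYPE html>
<p>not a heading</p>'

# The first pair matches both addresses and is deleted.
# The second pair fails the inner address, so both lines are printed.
output=$(printf '%s\n' "$input" | sed '/<!DOCTYPE.*/{N;/\n<h1.*/d;}')
printf '%s\n' "$output"
```

Only the first DOCTYPE/h1 pair should disappear; the second DOCTYPE line and the line after it pass through untouched.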
To adjust this for three lines, you need only use N again. If you want to pull in multiple lines until some delimiter is reached, you can use a line-pump, which looks something like this:
/<!DOCTYPE.*/{
:pump
N
/some-regex-to-stop-pump/!b pump
/regex-which-indicates-we-should-delete/d
}
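To make the pump concrete, here is a minimal sketch with the placeholder regexes filled in. The choices are assumptions for illustration, not part of the original question: the pump stops once a closing </h1> has been read, and the collected block is deleted if it contains an <html> line.

```shell
# Hypothetical sample input: a DOCTYPE block that runs up to a closing </h1>.
input='<p>before</p>
<!DOCTYPE html>
<html>
<h1>Title</h1>
<p>after</p>'

# Starting at the DOCTYPE line, keep appending lines (N) until the
# pattern space contains </h1>, then delete the whole collected block
# if it also contains <html>.
output=$(printf '%s\n' "$input" | sed '/<!DOCTYPE/{
:pump
N
/<\/h1>/!b pump
/<html>/d
}')
printf '%s\n' "$output"
```

One caveat: if the stop pattern never appears, N eventually runs out of input. GNU sed then prints whatever it has collected and exits, whereas POSIX specifies quitting without printing the pattern space, so behavior at end of input can differ between implementations.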
However, writing a full XML parser in sed or awk is a Herculean task and you're likely better off using an existing solution.