Say I have a big XML dictionary formatted like so:
<entry>
<!-- arbitrary amount of lines -->
<head>SomeWord</head>
<!-- arbitrary amount of lines -->
</entry>
Assume I know that SomeWord is on line 3,026,138. I would like to search backwards from line 3,026,138 until I reach <entry>, but I don't know how many lines there are between <entry> and my target line.
This answer works properly if I use the line number rather than a pattern, as follows:
sed '/<entry>/h;//!H;3026138!d;x;q' file
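For reference, here is my understanding of how that one-liner works, as a minimal sketch on a tiny made-up file. The file name sample.xml and its contents are hypothetical, and the target line is 6 here instead of 3026138:

```sh
# Build a small sample file (hypothetical data, for illustration only).
printf '%s\n' \
  '<entry>' '<head>First</head>' '</entry>' \
  '<entry>' '<pos>n</pos>' '<head>SomeWord</head>' '<sense>x</sense>' '</entry>' \
  > sample.xml

# Same command as above, with the target line set to 6 for this sample:
#   /<entry>/h   on an <entry> line, overwrite the hold space with that line
#   //!H         on any other line, append the line to the hold space
#                (the empty regex // reuses the last regex, i.e. <entry>)
#   6!d          on every line except line 6, delete it and start the next cycle
#   x;q          on line 6, swap the hold space into the pattern space,
#                then quit (the pattern space is printed on exit)
result=$(sed '/<entry>/h;//!H;6!d;x;q' sample.xml)
printf '%s\n' "$result"
```

This prints everything from the nearest preceding <entry> through the target line, which is the output I want. The catch is that sed still runs every line before the target through the script in order to reach it.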
However, this is a somewhat suboptimal solution, as I think sed is scanning from line 1 and crawling through the file for 3 million lines. This seems wasteful, since I already know which area of the file I want to work in. All in all, it takes about half a second.
Does anyone have a solution that takes advantage of the fact that I already know the line number, and that uses only standard Unix/sh programs that everyone already has (such as grep, awk, sed, and so on)?
Note: please do not suggest I use something like xmllint. Not only is it extremely slow, but I'd also like this to be a meta-format-agnostic script.