I have an HTML file containing a large number of <section>...</section> blocks, in the following format.
<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>
<section>
<div>
<header><h2>This is a title (RfQVthHm)</h2></header>
More HTML codes...
</div>
</section>
<section>
<div>
<header><h2>This is a title (UaHaZWvm)</h2></header>
More HTML codes...
</div>
</section>
<section>
<div>
<header><h2>This is a title (vxzbXEGq)</h2></header>
More HTML codes...
</div>
</section>
</body>
</html>
I need to extract the second <section>...</section> block.
This is the expected output:
<section>
<div>
<header><h2>This is a title (UaHaZWvm)</h2></header>
More HTML codes...
</div>
</section>
I noticed that I can search for the string UaHaZWvm first, back up two lines to the opening <section>, and then print everything up to the next </section>.
OP's efforts (mentioned in comments): grep -o "hi.*bye" file
Can this be done with awk, sed, or grep, please?
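As a rough illustration of the idea above, here is one possible awk sketch. Instead of backing up two lines, it buffers each <section>...</section> block and prints the buffer only if it contains the ID. It assumes every <section> and </section> tag sits on its own line, as in the sample; the file name sample.html and the helper name extract_section are hypothetical, chosen only for this example.

```shell
# Hypothetical sample file mirroring the format shown in the question.
cat > sample.html <<'EOF'
<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>
<section>
<div>
<header><h2>This is a title (RfQVthHm)</h2></header>
More HTML codes...
</div>
</section>
<section>
<div>
<header><h2>This is a title (UaHaZWvm)</h2></header>
More HTML codes...
</div>
</section>
<section>
<div>
<header><h2>This is a title (vxzbXEGq)</h2></header>
More HTML codes...
</div>
</section>
</body>
</html>
EOF

# extract_section ID FILE (hypothetical helper)
# Buffers each section and prints it only when the buffer contains ID.
extract_section() {
  awk -v id="$1" '
    /<section>/   { buf = ""; inside = 1 }   # opening tag: start a fresh buffer
    inside        { buf = buf $0 ORS }       # collect every line of the section
    /<\/section>/ {                          # closing tag: decide whether to print
      if (inside && index(buf, id)) printf "%s", buf
      inside = 0
    }
  ' "$2"
}

result=$(extract_section UaHaZWvm sample.html)
printf '%s\n' "$result"
```

The ID is matched with index(), so it is treated as a literal string rather than a regular expression. As for the earlier attempt, grep matches one line at a time, so a pattern like "hi.*bye" cannot span several lines.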