Consider a very long HTML page held as a string. How can I extract a tag together with its content? Any long Wikipedia page illustrates the case.
Using a parser like cheerio is excluded for performance reasons, and so is any technique that parses the entire page (which is what the existing
answers do; please read the question before marking it as a duplicate).
The start position is easily found with indexOf("<div class='selector'>");
The issue is with the end position.
How can I find the matching closing </div>, based on the start tag position? There are many other divs nested inside.
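One possible approach, shown here only as a minimal sketch, is to scan forward from the start position and keep a nesting counter: add 1 for every opening div tag, subtract 1 for every closing one, and stop when the counter returns to 0. Only the part of the string between the start tag and its matching close is read, not the whole page. The helper name extractDiv is hypothetical, and the sketch assumes that no "<div" or "</div" text appears inside comments, scripts, or attribute values.

```javascript
// Hypothetical helper (not from any library): returns the substring from the
// opening tag at `start` through its matching </div>, or null if none is found.
// Assumption: no "<div" / "</div" text inside comments, scripts or attributes.
function extractDiv(html, start) {
  if (start < 0) return null;

  // Matches an opening or closing div tag; the lookahead avoids
  // matching other tag names that merely begin with "div".
  const tagRe = /<(\/?)div(?=[\s>\/])[^>]*>/gi;
  tagRe.lastIndex = start; // begin scanning at the known start tag

  let depth = 0;
  let match;
  while ((match = tagRe.exec(html)) !== null) {
    depth += match[1] ? -1 : 1; // "/" captured means a closing tag
    if (depth === 0) {
      // tagRe.lastIndex now points just past the matching </div>
      return html.slice(start, tagRe.lastIndex);
    }
  }
  return null; // unbalanced markup: no matching close
}

const sample = "<div class='selector'>a<div>b</div>c</div><div>z</div>";
const startPos = sample.indexOf("<div class='selector'>");
console.log(extractDiv(sample, startPos));
// prints "<div class='selector'>a<div>b</div>c</div>"
```

The scan stops as soon as the counter reaches 0, so the cost depends on the size of the extracted element rather than on the length of the page.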