I'm trying to extract a section of text from a parsed HTML page. The section starts after a pattern ("Item c") that occurs several times in the page (there are 3 occurrences of "Item c").
When I run my code, I only get the last occurrence, but I need just the first one.
Here's the HTML structure of the first occurrence and some code I've used to find the beginning and end of the text:
<p>
<font style="display:inline;">Item c. Mike’s bike</font>
</p>...
a <- grep("^Item\\s*c\\.\\s*M", f.text, ignore.case = TRUE)
b <- grep("^Item\\s*d\\.\\s*Q", f.text, ignore.case = TRUE)
I tried matching part of the words exactly, but that doesn't always work.
Is there an indexing trick or a more general matching approach I can use?
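A minimal sketch of the indexing idea, assuming f.text is a character vector with one element per line of page text. The sample lines below are made up for illustration, and the names start, end, and section are hypothetical. Since grep() returns the positions of every matching element, the first occurrence can be selected by index:

```r
# Hypothetical sample data: one element per line of extracted page text.
f.text <- c(
  "Table of contents",
  "Item c. Mike's bike",
  "Some text belonging to the first section",
  "Item d. Quick notes",
  "Item c. Mike's bike",
  "Item d. Quick notes",
  "Item c. Mike's bike"
)

# grep() returns the indices of ALL matching elements, in order.
a <- grep("^Item\\s*c\\.\\s*M", f.text, ignore.case = TRUE)
b <- grep("^Item\\s*d\\.\\s*Q", f.text, ignore.case = TRUE)

# Keep the first start marker, then the first end marker located after it.
start <- a[1]
end <- b[b > start][1]

# Lines strictly between the two markers
# (assumes at least one line separates them).
section <- f.text[(start + 1):(end - 1)]
print(section)
```

If no match is found, a[1] is NA, so it may be worth checking length(a) > 0 before indexing.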
Thank you in advance
Disclaimer: I'm fairly new to R :)