I am having some trouble with this. I tried using Regex, but it does not give me the exact number of matches I need. I know how many matches I am supposed to have.
I am trying to find all occurrences of a specific word in a .txt file that is just an HTML page saved as text.
The problem is that the word I am searching for can be an id, a class, or just text on the page, so I need to search the entire page source for the word.
Also, with Regex, if the word was 'car', it was also matching inside 'racecar', for example.
I looked into https://jsoup.org/. Is that the best way to go?
Just so I am clear, I want my method to find, in this example, 'dog' twice in this piece of HTML:
<p id="Dog">The dog went for a walk today.</p>
I hope that is clear. This might even be doable with Regex, and I could simply have been doing it incorrectly. I was using Pattern with the pattern \\bwordToBeSearchedFor\\b
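For reference, here is a minimal sketch of the kind of count I am after, using only java.util.regex. The class name `WordCounter` and the method `countWord` are made up for illustration. It assumes the search should be case-insensitive (so that `Dog` in the id and `dog` in the text both count) and that the word should be quoted with `Pattern.quote` so it is treated literally. It is not necessarily the right approach for every page.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordCounter {

    // Hypothetical helper: counts whole-word, case-insensitive occurrences of `word` in `text`.
    static int countWord(String text, String word) {
        // \b anchors the match at word boundaries, so "car" does not match inside "racecar".
        // Pattern.quote treats the word literally even if it contains regex metacharacters.
        Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(word) + "\\b",
                Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String html = "<p id=\"Dog\">The dog went for a walk today.</p>";
        System.out.println(countWord(html, "dog"));                       // prints 2
        System.out.println(countWord("A racecar and just a car", "car")); // prints 1
    }
}
```

Without `Pattern.CASE_INSENSITIVE`, the `Dog` in the id attribute would not be counted, which is one possible reason for getting fewer matches than expected.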
A racecar and just a car
` – Victor Gubin Feb 19 '19 at 13:10