I have a predefined set of words such as murder, crime, officer, robbery, culprits, mishap, accident, crash, killed, ... (around 5000 words).
I want to match these words in a news article (approx. 1 KB to 5 KB of text) and, if found, categorize those words accordingly. Initially I just used spaces before and after each word, i.e.
if (article.contains(" " + word + " ")) { /* do something */ }
But this does not work when the word is followed by a full stop, comma, or other symbol; the same goes for the beginning of the word.
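To make the failure concrete, here is a minimal sketch of the space-padded check. The class name, method name, and sample sentence are made up for illustration only.

```java
public class SpaceMatchDemo {
    // Naive check: the word only counts if it has a space on both sides.
    static boolean containsWithSpaces(String article, String word) {
        return article.contains(" " + word + " ");
    }

    public static void main(String[] args) {
        // Hypothetical sample text, not taken from a real article.
        String article = "Two people were killed, said an officer.";

        // "people" is surrounded by spaces, so the check finds it.
        System.out.println(containsWithSpaces(article, "people"));
        // "killed" is followed by a comma, so the check misses it.
        System.out.println(containsWithSpaces(article, "killed"));
        // "officer" is followed by a full stop, so the check misses it too.
        System.out.println(containsWithSpaces(article, "officer"));
    }
}
```

A word at the very start of the article is missed for the same reason, since nothing precedes it.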
So I switched to regex with word boundaries, but the code now runs 20x slower and CPU usage goes to 100% with 5 threads.
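A word-boundary check of this kind might look roughly like the sketch below. It assumes one `\b...\b` pattern per word, each run over the whole article; that is an assumption for illustration, not necessarily the exact code in use, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class BoundaryMatchDemo {
    // Returns the words that occur in the article as whole words.
    // Assumption: one regex per word, each scanning the full article.
    static List<String> findWords(String article, List<String> words) {
        List<String> found = new ArrayList<>();
        for (String word : words) {
            // Pattern.quote guards against regex metacharacters in the word.
            Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
            if (p.matcher(article).find()) {
                found.add(word);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical sample text and word list.
        String article = "Two people were killed, said an officer.";
        List<String> words = Arrays.asList("killed", "officer", "crash");
        System.out.println(findWords(article, words));
    }
}
```

Written this way, a list of around 5000 words means around 5000 pattern compilations and article scans per article.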
Does anybody have a better solution in Java? All help is appreciated :)