I'm trying to remove stop words from a string with a single .replace() call, because that turned out to be the fastest approach in this performance test. However, I run into a problem when two stop words follow each other, as in the snippet below:
var stopWordsRE = /((?:^|\s+?)(foo|bar)(?:$|\s+?))/gi;
var text = "foo bar baz bar foobar";
var filtered = text.replace(stopWordsRE, " ");
console.log(filtered); // " bar baz foobar" (the first bar is not removed)
But it's supposed to return:
baz foobar
The problem is that the regular expression matches foo and the whitespace that follows it, so there is no preceding whitespace left for bar to match. I thought the non-capturing groups would be enough, so that the whitespace is not remembered, but apparently they are not: a non-capturing group still consumes the characters it matches. How can I fix the regex so that it also matches stop words that directly follow each other?
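To make the consumption issue described above concrete, here is a minimal sketch. It first lists what the original pattern matches, then tries one possible variant that swaps the trailing group for a lookahead, which checks for whitespace or end of string without consuming it. The name stopWordsLookaheadRE, the empty replacement string, and the final .trim() are assumptions made for this illustration, not part of the original code.

```javascript
var text = "foo bar baz bar foobar";

// Original pattern: the trailing group consumes the whitespace after each
// stop word, so the next stop word has no leading whitespace left to match.
var stopWordsRE = /((?:^|\s+?)(foo|bar)(?:$|\s+?))/gi;
var consumed = text.match(stopWordsRE);
console.log(consumed); // [ 'foo ', ' bar ' ] (the first "bar" is skipped)

// Hypothetical variant: the trailing check is a lookahead, so the whitespace
// stays available as the leading whitespace of the following stop word.
var stopWordsLookaheadRE = /(?:^|\s+)(?:foo|bar)(?=\s|$)/gi;

// Each match includes its own leading whitespace, so replace with an empty
// string and trim whatever whitespace remains at the edges.
var filtered = text.replace(stopWordsLookaheadRE, "").trim();
console.log(filtered); // baz foobar
```

With this variant, foobar is left alone because the lookahead after foo sees b rather than whitespace or the end of the string. Whether it keeps the performance advantage of the single .replace() call would need to be measured separately.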