A word boundary, \b, does not consume any characters. It is a zero-width assertion that only matches a position: between a word char and a non-word char, between the start of the string and a word char, or between a word char and the end of the string.
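A quick way to see the zero-width behaviour, as a small sketch (the sample string is made up for illustration): splitting on \b cuts the string at the boundaries without removing any characters, and a \b match is always the empty string.

```javascript
// Sample input chosen for illustration only
var sample = "foo bar";

// Splitting on \b cuts at each boundary but drops no characters
var parts = sample.split(/\b/);
console.log(parts); // [ 'foo', ' ', 'bar' ]

// The match itself is empty: \b matched a position, not a character
var boundary = /\b/.exec(sample);
console.log(boundary[0].length, boundary.index); // 0 0
```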
You need to use \s+ to consume the whitespace between words, and capture inside a positive lookahead to get overlapping matches:
var n = 2;
var s = "Lorem ipsum dolor sit amet, consectetur adipiscing elit";
// One word followed by n-1 "whitespace + word" groups, captured inside a lookahead
var re = new RegExp("(?=(\\b\\w+(?:\\s+\\w+){" + (n - 1) + "}\\b))", "g");
var res = [], m;
while ((m = re.exec(s)) !== null) { // Iterate through the matches
  // The lookahead match is zero-width, so advance lastIndex
  // manually to avoid an infinite loop
  if (m.index === re.lastIndex) {
    re.lastIndex++;
  }
  res.push(m[1]); // Collect the results (Group 1 values)
}
console.log(res);
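To see why the lookahead is needed, here is a small comparison on a made-up input: a plain consuming pattern skips every pair that shares a word with the previous match, while the lookahead version reports them all.

```javascript
var text = "a b c d"; // illustrative input

// Consuming match: "b c" is lost because "b" was already consumed by "a b"
var consumed = text.match(/\b\w+\s+\w+\b/g);
console.log(consumed); // [ 'a b', 'c d' ]

// Capturing inside a lookahead: nothing is consumed, so pairs may overlap
var overlapping = [], m, re = /(?=(\b\w+\s+\w+\b))/g;
while ((m = re.exec(text)) !== null) {
  if (m.index === re.lastIndex) re.lastIndex++; // step past the zero-width match
  overlapping.push(m[1]);
}
console.log(overlapping); // [ 'a b', 'b c', 'c d' ]
```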
The final pattern is built dynamically because you need to pass a variable to the regex, so you need the RegExp constructor notation. For n = 2, it will look like
/(?=(\b\w+(?:\s+\w+){1}\b))/g
And it will find all locations in the string that are followed by the following sequence:
\b - a word boundary
\w+ - 1 or more word chars
(?:\s+\w+){n-1} - n-1 sequences ({1} when n = 2) of:
    \s+ - 1 or more whitespace chars
    \w+ - 1 or more word chars
\b - a trailing word boundary
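Putting it together, the steps above can be wrapped in a reusable function. This is a sketch only: the name wordNGrams is hypothetical, it assumes n is a positive integer, and it swaps the manual exec loop for String.prototype.matchAll (ES2020), which advances past empty matches itself.

```javascript
// Hypothetical helper (not from the original answer): returns every run of
// n consecutive whitespace-separated words, including overlapping runs.
// Assumes n is a positive integer.
function wordNGrams(s, n) {
  var re = new RegExp("(?=(\\b\\w+(?:\\s+\\w+){" + (n - 1) + "}\\b))", "g");
  // matchAll steps past zero-width matches, so no lastIndex bookkeeping is needed
  return Array.from(s.matchAll(re), function (m) { return m[1]; });
}

console.log(wordNGrams("a b c d", 2)); // [ 'a b', 'b c', 'c d' ]
console.log(wordNGrams("a b c d", 3)); // [ 'a b c', 'b c d' ]
```

Note that words separated by punctuation, such as "amet, consectetur" in the sample string, are not paired, because \s+ only matches whitespace.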