Weird behavior with String#match() capturing groups

Question

Problem: I have a string, e.g. "to be to see to read", and I'd like to capture the 3 verbs without the "to " prefix, in this case: be, see and read.

On Regex 101, I've tried this really simple regex and it solved the problem:

Regex: /to (\w+)/g
Result: ['be', 'see', 'read']

Out of curiosity, I wrote another regex, using a positive lookahead, and the results were the same.

Regex: /(?=to \w+)\w+ (\w+)/g
Result: ['be', 'see', 'read']

Okay. The weird thing is that when I run these regexes in the browser console (either Chrome or Firefox), the results are different. Both of the following attempts give me the same result: all three matches, including the "to " prefix.

> 'to be to see to read'.match(/to (\w+)/g)
  ["to be", "to see", "to read"]

> 'to be to see to read'.match(/(?=to \w+)\w+ (\w+)/g)
  ["to be", "to see", "to read"]

Am I missing something here or am I stepping on a bug?

Disclaimer: This is not homework, I'm just validating this for a bigger problem. I'm not a regex expert but know a thing or two about it.

EDIT: I think I was fooled by Regex101. The code sample it generated used String#match(), but with the g flag that function returns only the full matches, not the capture groups. Looping over RegExp#exec() matches is the way to go!
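
To illustrate the EDIT above, here is a minimal sketch of how String#match() behaves with and without the g flag. The variable names are only illustrative.

```javascript
var input = 'to be to see to read';

// With the g flag, match() returns every full match and drops the capture groups.
var globalResult = input.match(/to (\w+)/g);
console.log(globalResult);
//=> ["to be", "to see", "to read"]

// Without the g flag, match() returns only the first match:
// the full match at index 0 and the capture groups from index 1 on.
var firstResult = input.match(/to (\w+)/);
console.log(firstResult[0], firstResult[1]);
//=> to be be
```

So the capture group is available from match(), but only for the first match; collecting it for every match needs a loop.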

score 1 · Accepted Answer · answered Jan 21 '14 at 18:28

The correct way to collect capture groups from a global regex in JavaScript is to call the RegExp#exec method in a while loop:

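var re = /to (\w+)/g,   // the g flag makes exec() resume from re.lastIndex
    matches = [],
    input = "to see to be to read",
    match;              // declared here to avoid creating an implicit global
// exec() returns one match per call, and null once there are no more
while ((match = re.exec(input)) !== null)
   matches.push(match[1]);   // index 1 holds the first capture group

console.log(matches);
//=> ["see", "be", "read"]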
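
As a possible alternative in newer engines (an assumption beyond what this answer states), String.prototype.matchAll, standardized in ES2020, yields one match object per match with the capture groups included. A minimal sketch, with `verbs` as an illustrative name:

```javascript
var input = "to see to be to read";

// matchAll() requires the g flag and returns an iterator of match objects;
// index 1 of each match object is the first capture group.
var verbs = Array.from(input.matchAll(/to (\w+)/g), function (m) {
  return m[1];
});

console.log(verbs);
//=> ["see", "be", "read"]
```

This avoids the manual loop, but it will not run in engines that predate ES2020, where the exec() loop remains the option.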

answered Jan 21 '14 at 18:28

anubhava

Hmmm... I see. So I think what's left on the question is: Why isn't `String#match()` working as expected? – everton Jan 21 '14 at 18:31
It is working as expected: with the g flag, it returns only the substrings that satisfy the whole match. As anubhava said, the `.exec()` method is how you retrieve capture groups. – tenub Jan 21 '14 at 18:32
@EvertonAgner: `String#match` doesn't return all the captured groups the way PHP's `preg_match_all` does. See also this Q&A: http://stackoverflow.com/questions/432493/how-do-you-access-the-matched-groups-in-a-javascript-regex – anubhava Jan 21 '14 at 18:34
Oh I see, I was relying on that. Thanks! – everton Jan 21 '14 at 18:37
