34

I am exploring capturing groups in regex and I am confused by the lack of documentation on them. For example, can anyone tell me the difference between these two regexes:

/(?:madhur)?/

and

/(madhur)?/

As I understand it, the ? in the second one means "match madhur zero or one times in the string".

How is the first different from the second?

Madhur Ahuja

3 Answers

35

The first one won't store the group's match, so there is no $1 to refer to afterwards. The ?: prefix makes it a non-capturing group. This is usually done for slightly better performance and to avoid cluttering the backreferences.

In the second example, the characters in the capturing group will be stored in the backreference $1.
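As a minimal sketch of the difference (the input string "madhur ahuja" is only an illustrative assumption), both patterns match the same text, but only the capturing version adds an entry for the group to the result of match():

```javascript
// Same input, same overall match; only the stored groups differ.
const capturing = "madhur ahuja".match(/(madhur)?/);
const nonCapturing = "madhur ahuja".match(/(?:madhur)?/);

// Index 0 is always the full match; captured groups follow from index 1.
console.log(capturing.length);    // full match plus one captured group
console.log(nonCapturing.length); // full match only
console.log(capturing[1]);        // text captured by group 1
```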

Further Reading.

alex
  • Why would you want to use a non-capturing group? Wouldn't the parentheses be redundant in that case? In other words, what is the difference between /(?:madhur)?/ and /madhur?/ – Didier A. Jun 06 '13 at 19:53
  • The reason is to apply a quantifier to the whole text. And no, those two aren't the same: in the first, all of madhur is optional; in the second, only the r is optional. – Muhammad Umer Sep 02 '13 at 16:03
  • @alex... why do capturing groups give different results with match? For example, `" , ".match(/(\s+)?,(\s+)?/)` results in **[" , ", " ", " "]** while `" , ".match(/(\s+)?,(\s+)?/g)` or `" , ".match(/[\s+]?,[\s+]?/)` results in **[" , "]**. Can you explain why? – Muhammad Umer Sep 02 '13 at 16:06
  • @MuhammadUmer Adding `g` changes how matches are returned with `match()` if you have capturing groups. – alex Sep 02 '13 at 23:58
  • I know, I just learned that: http://stackoverflow.com/questions/18577704/why-capturing-group-results-in-double-matches-regex – Muhammad Umer Sep 05 '13 at 03:53
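The two points raised in these comments can be checked with a short sketch. The ^ and $ anchors and the variable names are assumptions added here to make the behaviour easy to test; matchAll() is included as one way to keep the groups when the g flag is set.

```javascript
// 1. Grouping changes what the ? applies to.
const wholeWordOptional = /^(?:madhur)?$/; // "madhur" or the empty string
const lastLetterOptional = /^madhur?$/;    // "madhu" or "madhur"

console.log(wholeWordOptional.test(""));       // empty string is accepted
console.log(lastLetterOptional.test(""));      // empty string is rejected
console.log(lastLetterOptional.test("madhu")); // only the r is optional

// 2. The g flag changes what match() returns when groups are present.
const input = " , ";
const withGroups = input.match(/(\s+)?,(\s+)?/);  // full match, then each group
const withGlobal = input.match(/(\s+)?,(\s+)?/g); // full matches only, groups dropped

// matchAll() requires the g flag and still reports the groups for every match.
const allMatches = [...input.matchAll(/(\s+)?,(\s+)?/g)];

console.log(withGroups.length, withGlobal.length, allMatches[0].length);
```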
20

Here's the most obvious example:

"madhur".replace(/(madhur)?/, "$1 ahuja");   // returns "madhur ahuja"
"madhur".replace(/(?:madhur)?/, "$1 ahuja"); // returns "$1 ahuja"

Captured groups are numbered in order, so the text matched by the first group can be recalled with $1, the second with $2, etc. If you capture a group (i.e. (...) instead of (?:...)), you can use these references; if you don't, there is nothing to refer to and $1 is inserted literally, as in the second line above. As another example, consider the following:

/(mad)hur/.exec("madhur");   // returns an array ["madhur", "mad"]
/(?:mad)hur/.exec("madhur"); // returns an array ["madhur"]
brymck
6

It doesn't affect the matching at all.

It tells the regex engine

  • not to store the group contents for use (as $1, $2, ...) by the replace() method
  • not to return it in the return array of the exec() method and
  • not to count it as a backreference (\1, \2, etc.)
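The three points above can be illustrated with a small sketch. The sample strings and variable names are assumptions chosen for illustration; in each pattern the non-capturing group is skipped when groups are numbered, so the group that follows it becomes group 1.

```javascript
// replace(): $1 is the first *capturing* group, so (?:john) is skipped.
const replaced = "john smith".replace(/(?:john) (smith)/, "$1!");

// exec(): only the capturing group appears after the full match.
const execResult = /(?:mad)(hur)/.exec("madhur");

// Backreference: \1 refers to (b), not to (?:a).
const repeatsB = /^(?:a)(b)\1$/.test("abb");
const repeatsA = /^(?:a)(b)\1$/.test("aba");

console.log(replaced, execResult.length, repeatsB, repeatsA);
```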
AndreKR
  • One minor nit: It will change the matching in some cases. E.g. `/(foo)\1/` will match `"foofoo"`, but `/(?:foo)\1/` will not. The `\1` is interpreted as a back-reference in the first, and as an octal escape sequence in the second. – Mike Samuel Jun 21 '11 at 00:27
  • Why are these two different: `" , ".match(/(\s+)?,(\s+)?/)` and `" , ".match(/[\s+]?,[\s+]?/)`? They output different arrays. – Muhammad Umer Sep 02 '13 at 16:15
  • One uses a group that says "one or more whitespaces or none at all" and the other one uses a character class that says "a whitespace or a plus or nothing at all". – AndreKR Sep 02 '13 at 17:58
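Both comments can be verified with a short sketch. It assumes regexes without the u flag, since the octal interpretation of `\1` is legacy behaviour and unicode-mode patterns reject it; the anchors and variable names are added here only for illustration.

```javascript
// \1 with a capturing group is a backreference to "foo".
const backrefMatches = /(foo)\1/.test("foofoo");

// With no capturing group, \1 falls back to the octal escape for U+0001.
const noGroupOnFoofoo = /(?:foo)\1/.test("foofoo");
const noGroupOnControlChar = /(?:foo)\1/.test("foo\x01");

// (\s+)? is an optional run of whitespace; [\s+]? is one optional
// character that is either whitespace or a literal plus sign.
const groupAcceptsRun = /^(\s+)?$/.test("   ");
const classAcceptsRun = /^[\s+]?$/.test("   ");
const groupAcceptsPlus = /^(\s+)?$/.test("+");
const classAcceptsPlus = /^[\s+]?$/.test("+");

console.log(backrefMatches, noGroupOnFoofoo, noGroupOnControlChar);
console.log(groupAcceptsRun, classAcceptsRun, groupAcceptsPlus, classAcceptsPlus);
```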