I came across a regex that has ?: characters, for example:

(?:\s*)

This matches zero or more whitespace characters, but I can't find an explanation of what ?: is for. I know that in ':?' the colon would be optional, but what does ?: mean?

Zed
  • Did you try to read a tutorial about regex? – Casimir et Hippolyte Jun 08 '15 at 21:10
  • Searching the actual words will usually yield results `colon question mark regex`. http://stackoverflow.com/questions/3512471/non-capturing-group – chris85 Jun 08 '15 at 21:13
  • @CasimiretHippolyte Please provide me with the link where the sequence ?: is explained, I have a hard time finding it. – Zed Jun 08 '15 at 21:14
  • You have the wrong approach, instead of trying to decode symbols of a regex pattern one by one, read a tutorial. In this way you will no more spend your time to search. – Casimir et Hippolyte Jun 08 '15 at 21:16

2 Answers

(x) Matches 'x' and remembers the match, as the following example shows. The parentheses are called capturing parentheses.

The '(foo)' and '(bar)' in the pattern /(foo) (bar) \1 \2/ match and remember the first two words in the string "foo bar foo bar". The \1 and \2 in the pattern match the string's last two words. Note that \1, \2, \n are used in the matching part of the regex. In the replacement part of a regex the syntax $1, $2, $n must be used, e.g.: 'bar foo'.replace( /(...) (...)/, '$2 $1' ).
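To make the two syntaxes concrete, here is a small sketch in JavaScript that reuses the patterns quoted above. The variable names (`repeated`, `found`, `swapped`) are only for illustration and are not part of the quoted documentation.

```javascript
// Inside the pattern, \1 and \2 must match again whatever
// groups 1 and 2 captured earlier in the same match.
const repeated = /(foo) (bar) \1 \2/;
const found = "foo bar foo bar".match(repeated);

console.log(found[0]); // whole match: "foo bar foo bar"
console.log(found[1]); // first capture: "foo"
console.log(found[2]); // second capture: "bar"

// The backreferences fail if the later words differ from the captures.
console.log(repeated.test("foo bar foo baz")); // false

// In a replacement string the same captures are written $1 and $2.
const swapped = 'bar foo'.replace(/(...) (...)/, '$2 $1');
console.log(swapped); // "foo bar"
```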

(?:x) Matches 'x' but does not remember the match. The parentheses are called non-capturing parentheses, and let you define subexpressions for regular expression operators to work with. Consider the sample expression /(?:foo){1,2}/. If the expression was /foo{1,2}/, the {1,2} characters would apply only to the last 'o' in 'foo'. With the non-capturing parentheses, the {1,2} applies to the entire word 'foo'.
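The difference is easy to check in a JavaScript console. The sketch below adds `^` and `$` anchors (my addition, not in the quoted text) so that each test has to account for the whole string; the variable names are illustrative.

```javascript
// With the non-capturing group, {1,2} repeats the whole word "foo".
const grouped = /^(?:foo){1,2}$/;
// Without a group, {1,2} repeats only the final "o".
const ungrouped = /^foo{1,2}$/;

console.log(grouped.test("foofoo"));   // true
console.log(ungrouped.test("foofoo")); // false
console.log(grouped.test("fooo"));     // false
console.log(ungrouped.test("fooo"));   // true

// A capturing group adds an entry to the result; a non-capturing one does not.
const withCapture = "foofoo".match(/(foo){1,2}/);
const withoutCapture = "foofoo".match(/(?:foo){1,2}/);

console.log(withCapture.length);    // 2: whole match plus one capture
console.log(withoutCapture.length); // 1: whole match only
```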

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions

frhack

It means match but don't capture. In regex processing, everything within ( ) is matched and, by default, captured. With (?: ) the text still has to match, but it is not stored as a capture. The distinction is in the resulting data.

You use it when you want to verify that a pattern is present but don't need all of the text that makes up the match.

Why?

Say there is a rule that a user has to enter a phone number with dashes, such as 303-555-1234. By using match but don't capture, we can require the dashes in the match while capturing only the digits to save into a database.

(\d\d\d)(?:-)(\d\d\d)(?:-)(\d\d\d\d)

So then we require a full match of the above, but when we extract the captures we only get the digits.

Match 0 : 303-555-1234
Match 1 : 303
Match 2 : 555
Match 3 : 1234
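The same pattern can be tried in JavaScript, where index 0 of the result is the full match and the later indexes are the captures. This is only a sketch: the names `phone`, `result`, and `digits` are made up for the example, and joining the captures is just one way the "save only the digits" step might look.

```javascript
// Same pattern as above: the dashes must be present but are not captured.
const phone = /(\d\d\d)(?:-)(\d\d\d)(?:-)(\d\d\d\d)/;
const result = "303-555-1234".match(phone);

console.log(result[0]); // "303-555-1234"
console.log(result[1]); // "303"
console.log(result[2]); // "555"
console.log(result[3]); // "1234"

// Only three captures exist, so joining them yields the bare digits.
const digits = result.slice(1).join("");
console.log(digits); // "3035551234"

// Input without the dashes does not match at all.
console.log(phone.test("3035551234")); // false
```

For a single literal character a bare `-` behaves the same as `(?:-)`; the non-capturing group matters once the separator is a longer subexpression or needs a quantifier.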

I think of it as a way to anchor a match so that it is complete, without keeping all of the data.

A similar effect is available in .NET through the RegexOptions.ExplicitCapture option, which treats plain ( ) groups as non-capturing so that only named groups, written (?<name>...), are captured.

ΩmegaMan