
What does ?: mean when using 'or' in a Python regex?

e.g.

(?:^|\n) does capture the match in, say, the following text:

sample text sample text\nsample text sample text

but (^|\n) does not.

What is the reason for that?

Braj
tural
  • `(^|\n)` captures the start or a new-line character. http://regex101.com/r/gL7lH7/2 – Avinash Raj Aug 21 '14 at 04:48
  • Did you check the [documentation](https://docs.python.org/2/library/re.html)? It's all explained there, and a quick search would have found it. – user2357112 Aug 21 '14 at 05:00

2 Answers


(?:...) is a non-capturing group:

  (?:                      group, but do not capture:
    ^                        the beginning of the string
   |                        OR
    \n                       '\n' (newline)
  )                        end of grouping

Have a look at the online demo.

Read more about Capturing

If you do not need the group to capture its match, you can optimize this regular expression into (?:Value). The question mark and the colon after the opening parenthesis are the syntax that creates a non-capturing group.

In other words

(?:^|\n) Non-capturing group

 1st Alternative: ^
    ^ assert position at start of the string
 2nd Alternative: \n
    \n matches a line-feed (newline) character (ASCII 10)
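
To make the difference concrete, here is a minimal sketch using the sample text from the question. The trailing literal `sample` in both patterns and the variable names are additions for illustration only. It suggests that both forms match at the same positions and that only the capturing form records what the group matched.

```python
import re

text = "sample text sample text\nsample text sample text"

# Same alternation, once as a capturing group and once as a non-capturing group.
capturing = re.compile(r'(^|\n)sample')
non_capturing = re.compile(r'(?:^|\n)sample')

# Both patterns match at the same positions in the text.
cap_spans = [m.span() for m in capturing.finditer(text)]
noncap_spans = [m.span() for m in non_capturing.finditer(text)]
print(cap_spans == noncap_spans)      # True

# Only the capturing version counts as a group and stores its match.
print(capturing.groups)               # 1
print(non_capturing.groups)           # 0

cap_groups = [m.groups() for m in capturing.finditer(text)]
noncap_groups = [m.groups() for m in non_capturing.finditer(text)]
print(cap_groups)                     # [('',), ('\n',)]
print(noncap_groups)                  # [(), ()]
```

So the choice between the two forms does not change what matches, only what is available afterwards through `m.group(1)` or `m.groups()`.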
Braj

(?:...) is called a non-capturing group: it only performs the matching operation and does not capture anything.

>>> s = "sample text sample text\nsample text sample text"
>>> print(s)
sample text sample text
sample text sample text
>>> import re
>>> m = re.findall(r'(?:^|\n)', s, re.M)  # No capturing group, so re.findall returns the whole matches (two line starts and one newline character).
>>> m
['', '\n', '']
>>> m = re.findall(r'(^|\n)', s, re.M)  # One capturing group, so re.findall returns what the group captured.
>>> m
['', '\n', '']
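
The two results above are identical because the group spans the entire pattern. As a rough sketch, the difference becomes visible once the pattern contains something besides the group. The extra literal `sample` and the variable names here are assumptions added for illustration.

```python
import re

s = "sample text sample text\nsample text sample text"

# No capturing group: findall returns the full text of each match.
whole = re.findall(r'(?:^|\n)sample', s)
print(whole)        # ['sample', '\nsample']

# One capturing group: findall returns only what that group captured.
group_only = re.findall(r'(^|\n)sample', s)
print(group_only)   # ['', '\n']

# Two capturing groups: findall returns one tuple per match.
pairs = re.findall(r'(^|\n)(sample)', s)
print(pairs)        # [('', 'sample'), ('\n', 'sample')]
```

This is why `(?:...)` is handy with `re.findall`: it groups the alternation without changing what the function returns.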
Avinash Raj