I don't believe this is a duplicate. I'm not asking what non-capturing groups are useful for; I'm asking for clarification of the definition. Instead of downvoting, please explain, and I'll remove this post if it turns out to be unhelpful to other readers.

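import re
pattern = r'(?:animal)(?:=)((\w+),)+'  # raw string so \w reaches the regex engine unchanged
str = 'animal=cat,dog,cat,tiger,dog\nanimal=cat,cat,dog,dog,tiger\nanimal=dog,dog,cat,cat,tiger'  # note: this name shadows the built-in str
f = re.match(pattern, str)  # pattern and str must be defined before this call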

where the string prints as:

animal=cat,dog,cat,tiger,dog
animal=cat,cat,dog,dog,tiger
animal=dog,dog,cat,cat,tiger

The documentation describes (?:...) as 'A non-capturing version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.'

If that is so, why does the match still return the string 'animal='?

In Python 3.6.3, f is:

<_sre.SRE_Match object; span=(0, 25), match='animal=cat,dog,cat,tiger,'>
f[0]
'animal=cat,dog,cat,tiger,'
f[1]
'tiger,'
f[2]
'tiger'
dia
  • What are you confused about? What result did you expect? – Mad Physicist Dec 04 '18 at 00:13
  • Essentially, a non-capturing group looks for a certain pattern, but doesn't actually include it in the match. – iz_ Dec 04 '18 at 00:13
  • Then why is it colored? If a certain term or phrase is to be excluded, there are other features to use, like negative lookahead. – dia Dec 04 '18 at 00:13
  • blue is the match, and any other colors on regex101 are capture groups. – emsimpson92 Dec 04 '18 at 00:15
  • Regarding 'but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern': I tried to substitute `pattern` in `re.match(str, pattern)` with that command, and it returned an `invalid syntax` error. – dia Dec 04 '18 at 00:44

2 Answers

(foo) is a capture group. (?:foo) is a non-capture group. (?P<foo>bar) is a named capture group in Python, where the name is "foo"; many other regex flavors write it as (?<foo>bar).

The point of capture groups is that they can be referenced later by group number or, if the group is named, by its name. That is helpful when you're trying to separate the match into chunks. A non-capture group still takes part in the match; it just isn't stored as a group of its own.
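As a rough illustration of the three group types in Python's `re` module, here is a minimal sketch. The sample string and the group name `kind` are made up for this example and are not from the question.

```python
import re

text = "animal=cat"  # made-up sample input

# Capturing group: "animal" is stored as group 1.
capturing = re.match(r"(animal)=(\w+)", text)
print(capturing.groups())       # ('animal', 'cat')

# Non-capturing group: "animal" is still matched and still part of
# group 0, but it gets no group number of its own.
non_capturing = re.match(r"(?:animal)=(\w+)", text)
print(non_capturing.group(0))   # animal=cat
print(non_capturing.groups())   # ('cat',)

# Named group, Python syntax (?P<name>...); "kind" is a hypothetical name.
named = re.match(r"(?:animal)=(?P<kind>\w+)", text)
print(named.group("kind"))      # cat

# Backreference: \1 reuses whatever group 1 matched, both inside the
# pattern and in the replacement string of re.sub.
deduped = re.sub(r"(\w+),\1", r"\1", "dog,dog,cat")
print(deduped)                  # dog,cat
```

Note that the non-capturing version changes only what `groups()` reports, not what the overall match covers.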

emsimpson92
  • Oh, so (other than the named example you gave) things like \1 and \2 refer to the earlier groups? But then the colored matches don't in fact mean anything? – dia Dec 04 '18 at 00:15
  • My mistake... you are correct. `\1` refers to group 1 inside the pattern. In a substitution the syntax depends on the flavor: Python's `re.sub` uses `\1` or `\g<1>`, while others such as JavaScript use `$1`. These things can get confusing. – emsimpson92 Dec 04 '18 at 00:16
  • [Here is an example of how capture groups could be useful](https://regex101.com/r/r6SwUz/1) – emsimpson92 Dec 04 '18 at 00:20

When matching a regular expression, anything in plain parentheses () is a capturing group. Group 0 is the entire matched string, including any text matched by non-capturing groups (?:...), while groups 1, 2, ... are the subgroups identified by () in the pattern. Non-capturing groups get no number.

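import re
rr = r'(?:animal)(?:=)((\w+),)+'  # raw string so \w is passed to the regex engine as-is
mystr = "animal=cat,dog,cat,tiger,dog"
res = re.search(rr, mystr)
print(res.group(0))  # whole match: animal=cat,dog,cat,tiger,
print(res.group(1))  # last repetition of the outer capturing group: tiger,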

You can also use https://pythex.org/ to test a pattern and inspect its groups.
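If the goal is to get only the animals without the 'animal=' prefix, one possible approach (a sketch, not the only way) is to put a single capturing group around the whole list, or to use a lookbehind so the prefix is checked but not consumed. The variable names below are made up for illustration.

```python
import re

line = "animal=cat,dog,cat,tiger,dog"

# A repeated capturing group keeps only its last repetition.
repeated = re.match(r"(?:animal)=((\w+),)+", line)
print(repeated.groups())   # ('tiger,', 'tiger')

# Option 1: capture the whole comma-separated list in one group,
# then split it. Group 0 still contains "animal=", group 1 does not.
listed = re.match(r"(?:animal)=([\w,]+)", line)
animals = listed.group(1).split(",")
print(animals)             # ['cat', 'dog', 'cat', 'tiger', 'dog']

# Option 2: a lookbehind asserts that "animal=" precedes the match
# without making it part of the match, so group 0 is the list itself.
looked = re.search(r"(?<=animal=)[\w,]+", line)
print(looked.group(0))     # cat,dog,cat,tiger,dog
```

The lookbehind is what actually excludes text from the match; a non-capturing group only excludes it from the numbered groups.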

Moe