I am trying to learn regex, and one of the exercises I am trying to solve is as follows:

I have a string:

    "london new york" 

that I am trying to match with a regex.

The pattern is:

    r"(..o(.)).+(\2)*"

The result is ndon new york.

As far as I understand, (\2) matches n, but what matches ew york? Also, what does the * in (\2)* do? Does it try to match n, or the special character .?

jdoe

1 Answer

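Take a look at it like this. Note that .+ is wrapped in an extra capture group here so you can see what it matches, which makes (\2)* group 4 instead of group 3: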

 (                             # (1 start)
      . . o
      ( . )                         # (2)
 )                             # (1 end)
 ( .+ )                        # (3)
 ( \2 )*                       # (4)

 **  Grp 0 -  ( pos 2 : len 13 ) 
ndon new york  
 **  Grp 1 -  ( pos 2 : len 4 ) 
ndon  
 **  Grp 2 -  ( pos 5 : len 1 ) 
n  
 **  Grp 3 -  ( pos 6 : len 9 ) 
 new york  
 **  Grp 4 -  NULL 

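You can see that group 4 is empty. It will always be empty, because
group 3 (the greedy .+) takes all the air out of the remainder of the
current line, never leaving anything for the optional group 4. Since *
also allows zero repetitions, the engine has no reason to backtrack and
give some of it back.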
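
To check this yourself, here is a minimal sketch using Python's re module. It uses the same regrouped pattern as the breakdown above; the lazy variant (.+?) is not part of the question and is only an assumption added for contrast, to show a case where the backreference does get something to match.

```python
import re

text = "london new york"

# Pattern from the question, with `.+` wrapped in a group as in the breakdown.
greedy = re.search(r"(..o(.))(.+)(\2)*", text)
print(greedy.span())            # (2, 15)
print(repr(greedy.group(3)))    # ' new york'
print(greedy.group(4))          # None: the greedy `.+` left nothing behind

# Contrast (illustrative assumption): a lazy `.+?` takes as little as it can,
# so the backreference to group 2 ("n") finds the "n" of "new".
lazy = re.search(r"(..o(.))(.+?)(\2)*", text)
print(repr(lazy.group(0)))      # 'ndon n'
print(repr(lazy.group(4)))      # 'n'
```

In the greedy case group 4 is reported as None, which corresponds to the NULL in the dump above.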

You still must be careful when you use the quantifier * or + on a
capture group. Each iteration of the group that matches contributes
to the whole match, but only the last iteration is retained by the
group, because its contents are overwritten each time.
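
A small sketch of that overwriting behavior in Python. The input string "abc" and the pattern (\w)+ are made up for illustration and are not from the question.

```python
import re

# The group (\w) is repeated three times: it matches "a", then "b", then "c".
repeated = re.fullmatch(r"(\w)+", "abc")
print(repeated.group(0))   # abc  (every iteration contributes to the whole match)
print(repeated.group(1))   # c    (only the last iteration is kept by the group)

# If you need the whole repeated run, put the quantifier inside the group.
whole = re.fullmatch(r"(\w+)", "abc")
print(whole.group(1))      # abc
```

So if you want everything the repetition matched, capture the repeated run as a whole rather than quantifying the capture group itself.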