Very quick and simple question.

Consider the vector of character strings ("AvAv", "AvAvAv")

Why does the pattern (Av)\1([^A]|$) match both strings?

The pattern says: have an instance of "Av", have another, then either have a character that is not an "A" or else come to the end of the string. The first string clearly matches, but I do not see how the second one does. It has two copies of "Av", but it then fails to end (missing the second disjunct) and fails to be followed by a character other than "A" (missing the first disjunct), so how does the pattern successfully match it?

Thank you so much for your time and assistance. It is greatly appreciated.

Gordon

2 Answers

Here is an explanation:

AvAv    - matches (Av)\1$

In this case, we can match Av, followed by the captured text again (\1), followed by the $ branch of the alternation. In the case of AvAvAv we also have a match:

AvAvAv  - again matches (Av)\1$
  ^^^^  last four letters match

The same logic applies here, except that the match starts at the third character: the pattern is not anchored, so the engine is free to skip the first Av.
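To see where each match actually begins, here is a small sketch. The question does not say which language is in use, so Python's `re` module is an assumption here, and `match_span` is a hypothetical helper name; `re.search` scans for a match anywhere in the string, which is the unanchored behavior described above.

```python
import re

# Same pattern as in the question, with no anchor at the start.
pattern = re.compile(r"(Av)\1([^A]|$)")

def match_span(text):
    """Return the (start, end) offsets of the first match, or None."""
    m = pattern.search(text)  # search() may start the match at any position
    return m.span() if m else None

for s in ["AvAv", "AvAvAv"]:
    print(s, match_span(s))
# AvAv (0, 4)
# AvAvAv (2, 6)
```

The span (2, 6) for the second string shows that the match covers only its last four characters.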

If the pattern were ^(Av)\1([^A]|$) then only AvAv would be a match.

Tim Biegeleisen

A RegEx only needs to match a part of the string to be considered "a match".

In other words, your RegEx matches this part:

AvAvAv
  ^^^^

for the second example.

If you don't want it to match the second one, anchor the pattern to the start of the string with a caret ^:

^(Av)\1([^A]|$)

In this way the second one won't be matched.
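As a quick check of the anchored pattern, here is a minimal sketch. It again assumes Python's `re` module rather than whatever language the question was asked in, and the variable names are illustrative only.

```python
import re

# The caret forces the match to begin at the first character of the string.
anchored = re.compile(r"^(Av)\1([^A]|$)")

strings = ["AvAv", "AvAvAv"]

# Keep only the strings in which the anchored pattern finds a match.
matched = [s for s in strings if anchored.search(s)]
print(matched)  # ['AvAv']
```

With the anchor in place, AvAvAv is rejected: after the leading AvAv the next character is an A, so neither branch of the alternation can succeed.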

iBug