
I am trying to use pattern matching in Python 2.7 to replace first-person pronouns with second-person ones.

import re

# these two work
string = re.sub(r'(\W)I(\W)', r'\g<1>you\g<2>', string)
string = re.sub(r'(\W)(me)(\W)', r'\g<1>you\g<3>', string)
# but this combined version does NOT work
string = re.sub(r'(\W)I|(me)(\W)', r'\g<1>you\g<3>', string)

I want to use the last regex, but somehow the capture groups are all messed up and even doing a \g<0> shows strange, irregular matches. I would think that capture group 3 would be the last word boundary, but it doesn't appear to be.

A sample sentence could be: I like candy.

I am not interested very much in the correctness of the replacement (me will never actually be selected since I goes first), but I don't know why the capture groups don't work as I would expect.

Thanks!

Nick Anderson

1 Answer


Try the following regex.

Regex: \b(I|me)\b

Explanation:

  • \b on each side matches a word boundary, so I and me are only matched as whole words.

  • (I|me) matches either I OR me.

Note: You can make it case-insensitive using the i flag (re.IGNORECASE in Python).
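For illustration, here is a minimal sketch of how this regex could be plugged into `re.sub`. The names `PRONOUN` and `to_second_person` are made up for this example and are not from the question.

```python
import re

# Word-boundary pattern from the answer above.
PRONOUN = re.compile(r'\b(I|me)\b')

def to_second_person(text):
    """Replace the standalone words 'I' and 'me' with 'you' (hypothetical helper)."""
    return PRONOUN.sub('you', text)

print(to_second_person('I like candy.'))        # you like candy.
print(to_second_person('Give me some candy.'))  # Give you some candy.
```

Because `\b` is zero-width, nothing around the word is consumed, so the replacement does not need `\g<1>` or `\g<3>` to put the neighbouring characters back. Note that the "me" inside "some" is left alone.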

Regex101 Demo

  • That's a way better answer. But do you know why the alternation seems to mess up the capture groups such that capture group 3 is no longer the one I would expect? – Nick Anderson Apr 06 '16 at 10:13
  • @NickAnderson: Since you used `\W` instead of a word boundary, it fails when `I` is at the beginning of the string. `(\W)(me)|I(\W)` will work too [Demo](https://regex101.com/r/gS0iA7/2) but using a word boundary is safer. –  Apr 06 '16 at 10:19
  • This is true. But comparing: string = re.sub(r'(\W)I|(me)(\W)', '\g<1>you\g<3>',string) and string = re.sub(r'(\W)(I|me)(\W)', '\g<1>you\g<3>',string), why are the groups different? Why does making the capture group nested under the alternation operator change how it works? I saw on the Python docs that you can use <2> to reference a nested level also, so does the meaning change based on context? – Nick Anderson Apr 06 '16 at 10:19
  • It's not about capturing groups. It will work even if you don't put a group around `me`. You do have to use a group to enclose all the alternation cases: `(I|me)`. –  Apr 06 '16 at 10:24
  • Thank you; that's very helpful! – Nick Anderson Apr 06 '16 at 12:33
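To make the group question from the comments concrete, here is a small sketch comparing the two patterns. It rests on two facts about `re`: `|` has the lowest precedence, so an ungrouped alternation splits the entire pattern into `(\W)I` OR `(me)(\W)`, and groups are numbered by their opening parenthesis from left to right regardless of which branch they sit in. The variable names and the sample sentence are chosen only for this illustration.

```python
import re

flat = re.compile(r'(\W)I|(me)(\W)')     # alternation splits the whole pattern
grouped = re.compile(r'(\W)(I|me)(\W)')  # alternation confined to group 2

sample = 'Do I like me?'

# Groups in the branch that did not match are reported as None.
flat_groups = [m.groups() for m in flat.finditer(sample)]
grouped_groups = [m.groups() for m in grouped.finditer(sample)]

print(flat_groups)     # [(' ', None, None), (None, 'me', '?')]
print(grouped_groups)  # [(' ', 'I', ' '), (' ', 'me', '?')]
```

So in the flat pattern, group 3 only exists for a match of the `me` branch, and group 1 only for a match of the `I` branch, which is why `\g<1>you\g<3>` never sees both at once. As far as I know, Python 2.7 raises an "unmatched group" error when the replacement refers to a group that did not participate, while Python 3.5 and later substitute an empty string.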