179

Is there any way to directly replace all groups using regex syntax?

The normal way:

re.match(r"(?:aaa)(_bbb)", string1).group(1)

But I want to achieve something like this:

re.match(r"(\d.*?)\s(\d.*?)", "(CALL_GROUP_1) (CALL_GROUP_2)")

I want to build the new string directly from the groups the regex just captured.

mc_kaiser

2 Answers

308

Have a look at re.sub:

result = re.sub(r"(\d.*?)\s(\d.*?)", r"\1 \2", string1)

This is Python's regex substitution (replace) function. The replacement string can be filled with so-called backreferences (backslash, group number) which are replaced with what was matched by the groups. Groups are counted the same as by the group(...) function, i.e. starting from 1, from left to right, by opening parentheses.
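
As a minimal sketch of the substitution described above: the question never shows what `string1` holds, so the sample inputs and the simplified `\d+` pattern below are assumptions chosen only to make the backreferences visible.

```python
import re

# Hypothetical sample input; the original post does not show string1.
string1 = "12 34"

# \1 and \2 in the replacement are filled with what groups 1 and 2 matched.
swapped = re.sub(r"(\d+)\s(\d+)", r"\2 \1", string1)
print(swapped)  # 34 12

# Text outside the match is left untouched; only the matched span is rebuilt.
labelled = re.sub(r"(\d+)\s(\d+)", r"first=\1, second=\2", "ids: 12 34.")
print(labelled)  # ids: first=12, second=34.
```

The first argument is the pattern, the second is the replacement template, and the third is the string being searched, so the "replacement values" come from the groups themselves.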

CharlesB
Martin Ender
  • Way more clear than the doc! Did not understand how group was working with this one. They should add such example. – tupui Apr 26 '18 at 21:46
  • It worked the first time. This is a clear way to explain it, thanks. Can you explain how nested groups should be referenced in a situation like `(r(r))r((r)((r)r))`? – Rakshitha Muranga Rodrigo Nov 08 '18 at 18:38
  • @RakshithaMurangaRodrigo The groups are numbered from left to right, going by where they start. So if I insert each group's number right in front of the group, they'd be sorted: `1(r2(r))r3(4(r)5(6(r)r))`. – Martin Ender Dec 07 '18 at 10:21
  • You can also provide a name for a group using this notation: `(?P<name>...)` and then reference it in the replacement this way: `\g<name>`. This is the most convenient way IMHO. – Playing With BI Nov 02 '21 at 09:44
  • but what is string1 ? I mean where do you put replacement values ? – Phil Apr 20 '22 at 09:37
  • The `r` prefix solved my problem – Ratul Hasan Jun 10 '22 at 21:18
  • Note: I don't think there is any purpose to your '?' in your regex... You already said (\d.*) which means any decimal digit and 0 or more any characters. * (0 or more) automatically means optional.. – Marshall Jobe Sep 13 '22 at 19:42
  • @MarshallJobe `?` after `*` does not mean optional but it makes the `*` ungreedy. That said, it's still unnecessary in this case (and probably even a bad idea), but I just reused the exact regex from the question, since the focus was on the substitution. – Martin Ender Sep 16 '22 at 14:49
  • @MartinEnder I see, thanks... forgot about ungreedy – Marshall Jobe Oct 07 '22 at 19:59
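
The group-numbering and named-group points from the comments above can be sketched as follows. The pattern keeps the nesting of `(r(r))r((r)((r)r))` but uses distinct letters, and the group names `first` and `second` are hypothetical; both are assumptions made only so the output is readable.

```python
import re

# Same nesting as (r(r))r((r)((r)r)), with distinct letters so that each
# group's content can be told apart (an assumption for illustration).
match = re.match(r"(a(b))c((d)((e)f))", "abcdef")
groups = match.groups()
print(groups)  # ('ab', 'b', 'def', 'd', 'ef', 'e')

# Named groups: (?P<name>...) in the pattern, \g<name> in the replacement.
# The names "first" and "second" are hypothetical.
renamed = re.sub(
    r"(?P<first>\d+)\s(?P<second>\d+)",
    r"\g<second> \g<first>",
    "12 34",
)
print(renamed)  # 34 12
```

Groups 1 to 6 appear in the tuple in the order their opening parentheses occur in the pattern.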
76

The accepted answer is perfect. I would add that group references are probably better written with this syntax:

r"\g<1> \g<2>"

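for the replacement string. This avoids the ambiguity that arises when a group reference is immediately followed by a digit: `\10` is read as a reference to group 10, whereas `\g<1>0` is group 1 followed by a literal `0`. Again, this is all in the documentation, nothing new, just sometimes difficult to spot at first sight.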
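
A small sketch of the digit-after-group case, assuming a hypothetical input `"7 8"` and the goal of appending a literal `0` to every number:

```python
import re

# r"\10" is parsed as a reference to group 10. The pattern only has
# one group, so Python rejects the replacement template.
error_message = None
try:
    re.sub(r"(\d+)", r"\10", "7 8")
except re.error as exc:
    error_message = str(exc)
print(error_message)

# \g<1> ends the group reference explicitly, so the 0 stays a literal.
result = re.sub(r"(\d+)", r"\g<1>0", "7 8")
print(result)  # 70 80
```

When no digit follows the reference, `\1` and `\g<1>` behave the same, so the longer form mainly pays off in templates like this one.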

Gringo Suave
benelgiac
  • If you want to add a number after a group this is the way to go, otherwise, it messes up the number value with the group ordinal. – xpeiro Nov 18 '20 at 12:45