6

I've just switched from Perl to Python and am disappointed by the re module. I'm looking for the Python equivalent of $1, or of Perl's other special regex variables. In Perl I would write: $_ = "<name>Joe</name>"; s/<(.*)>(.*)<[\/](.*)>/$2/; How do I do the same in Python? Thanks!

user1510648

2 Answers

15

In Python you can use \1, \2, and so on in the replacement string of re.sub to refer back to the groups captured by the pattern.

For example:

>>> re.sub(r'(\w+) (\w+)',r'\2 \1','Joe Bob')
'Bob Joe'
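Applied to the string from the question, the same idea might look like the sketch below. The variable names are illustrative, and the pattern assumes that `.*` was intended inside each group.

```python
import re

text = "<name>Joe</name>"

# \2 in the replacement stands for the second capture group,
# i.e. the text between the opening and closing tags.
result = re.sub(r"<(.*)>(.*)</(.*)>", r"\2", text)
print(result)
```

This mirrors the Perl `s/.../$2/` substitution: the whole match is replaced by the content of group 2.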

Or named substitutions (a Python innovation later ported to Perl):

>>> re.sub(r'(?P<First>\w+) (?P<Second>\w+)',r'\g<Second> \g<First>','Joe Bob')
'Bob Joe'
>>> ma=re.search(r'(?P<First>\w+) (?P<Second>\w+)','George Bush')
>>> ma.group(1)
'George'
>>> ma.group('Second')
'Bush'
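The closest counterpart to Perl's `$1`, `$2`, ... is the match object returned by `re.search`. A minimal sketch, reusing the pattern above (the names `m`, `first`, `second`, and `by_name` are only illustrative):

```python
import re

m = re.search(r"(?P<First>\w+) (?P<Second>\w+)", "George Bush")

if m:  # re.search returns None when the pattern does not match
    first, second = m.groups()   # all captured groups, in order
    by_name = m.groupdict()      # named groups as a dictionary
    print(first, second)
    print(by_name)
```

Unlike Perl's global variables, the captured groups live on the match object, so they stay available until that object is discarded.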

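Admittedly, Python's re module is a little weak compared with the regex support in recent versions of Perl.

For a first-class regex engine, install the newer regex module. It was once proposed for inclusion in the standard library (Python 3.4 was mentioned as a target), but it remains a third-party package on PyPI. It is very good.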

dawg
  • An example would be good, or at least a mention that backreferences work with `re.sub()`. Anyway, he isn't really doing a substitution; he is extracting a substring (his Perl example is arguably bad practice). – jordanm Aug 11 '12 at 02:41
  • `findall` does only that: finds all matches. There is no replacement for what is found natively. You need to use `re.sub` to find and replace... – dawg Sep 17 '19 at 16:46
4

You want the group() method of the match object (re.MatchObject in Python 2, re.Match in Python 3).

import re

var = "<name>Joe</name>"
# Each group needs .* (not a single .) for the pattern to match the tag
match = re.search(r"<(.*)>(.*)<[/](.*)>", var)
print(match.group(2))  # prints Joe

It looks like you are using regex to parse a tag-based markup language such as XML. See the following link on why you should use a parser such as ElementTree instead: https://stackoverflow.com/a/1732454/1032785
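As a rough sketch of the parser-based alternative, the standard library's `xml.etree.ElementTree` can read the same string without a regular expression. The variable name `element` is only illustrative.

```python
import xml.etree.ElementTree as ET

# Parse the XML fragment from the question into an Element
element = ET.fromstring("<name>Joe</name>")

print(element.tag)   # the tag name
print(element.text)  # the text between the tags
```

A parser also copes with attributes, nesting, and whitespace, which the regex above does not.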

jordanm