92

Is there a way in Python to access match groups without explicitly creating a match object (or another way to beautify the example below)?

Here is an example to clarify my motivation for the question:

The following Perl code

if    ($statement =~ /I love (\w+)/) {
  print "He loves $1\n";
}
elsif ($statement =~ /Ich liebe (\w+)/) {
  print "Er liebt $1\n";
}
elsif ($statement =~ /Je t\'aime (\w+)/) {
  print "Il aime $1\n";
}

translated into Python

import re

m = re.search(r"I love (\w+)", statement)
if m:
  print("He loves", m.group(1))
else:
  m = re.search(r"Ich liebe (\w+)", statement)
  if m:
    print("Er liebt", m.group(1))
  else:
    m = re.search(r"Je t'aime (\w+)", statement)
    if m:
      print("Il aime", m.group(1))

looks very awkward (if-else-cascade, match object creation).

Lin Du
Curd
  • 1
    Duplicate: http://stackoverflow.com/questions/122277/how-do-you-translate-this-regular-expression-idiom-from-perl-into-python – S.Lott Mar 31 '10 at 17:12
  • 3
    Caveat: Python re.match() specifically matches against the beginning of the target. Thus re.match("I love (\w+)", "Oh! How I love thee") would NOT match. You either want to use re.search() or explicitly prefix the regex with appropriate wildcard patterns for re.match(".* I love (\w+)", ...) – Jim Dennis Mar 31 '10 at 17:32
  • @Jim Dennis: thanks for pointing that out; I adapted the Python example accordingly – Curd Mar 31 '10 at 19:11
  • @S.Lott: oops, you are right. I didn't see it, even though I searched before posting; nevertheless, there are valuable new answers here – Curd Mar 31 '10 at 19:18
  • 3
    Possible duplicate of [How do you translate this regular-expression idiom from Perl into Python?](https://stackoverflow.com/questions/122277/how-do-you-translate-this-regular-expression-idiom-from-perl-into-python) – Brian H. Sep 06 '17 at 13:36
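
The re.match caveat raised in the comments above can be checked with a short sketch. The sample string is the one from the comment; the variable names are only illustrative.

```python
import re

text = "Oh! How I love thee"
pattern = r"I love (\w+)"

# re.match only tries the pattern at position 0, so it fails here.
anchored = re.match(pattern, text)

# re.search scans the whole string and finds the phrase in the middle.
found = re.search(pattern, text)

print(anchored)        # None
print(found.group(1))  # thee
```

If the phrase is known to start the string, re.match is fine; otherwise re.search is the safer default for this kind of translation table.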

5 Answers

74

You could create a little class that returns the boolean result of calling match, and retains the matched groups for subsequent retrieval:

import re

class REMatcher(object):
    def __init__(self, matchstring):
        self.matchstring = matchstring

    def match(self,regexp):
        self.rematch = re.match(regexp, self.matchstring)
        return bool(self.rematch)

    def group(self,i):
        return self.rematch.group(i)


for statement in ("I love Mary", 
                  "Ich liebe Margot", 
                  "Je t'aime Marie", 
                  "Te amo Maria"):

    m = REMatcher(statement)

    if m.match(r"I love (\w+)"): 
        print "He loves",m.group(1) 

    elif m.match(r"Ich liebe (\w+)"):
        print "Er liebt",m.group(1) 

    elif m.match(r"Je t'aime (\w+)"):
        print "Il aime",m.group(1) 

    else: 
        print "???"

Update for Python 3 print as a function, and Python 3.8 assignment expressions - no need for a REMatcher class now:

import re

for statement in ("I love Mary",
                  "Ich liebe Margot",
                  "Je t'aime Marie",
                  "Te amo Maria"):

    if m := re.match(r"I love (\w+)", statement):
        print("He loves", m.group(1))

    elif m := re.match(r"Ich liebe (\w+)", statement):
        print("Er liebt", m.group(1))

    elif m := re.match(r"Je t'aime (\w+)", statement):
        print("Il aime", m.group(1))

    else:
        print("???")
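
For interpreters older than Python 3.8, the class-based idea can still be used under Python 3. The following is a minimal sketch and not part of the original answer: `SearchMatcher`, its `search` method, and the `translate` helper are hypothetical names, and it assumes re.search is wanted instead of re.match so that phrases in the middle of a string are found too.

```python
import re


class SearchMatcher:
    """Hypothetical variant of REMatcher that uses re.search."""

    def __init__(self, text):
        self.text = text
        self.last = None  # most recent match object, or None

    def search(self, pattern):
        # Remember the match so group() can be called afterwards.
        self.last = re.search(pattern, self.text)
        return self.last is not None

    def group(self, i):
        return self.last.group(i)


def translate(statement):
    m = SearchMatcher(statement)
    if m.search(r"I love (\w+)"):
        return "He loves " + m.group(1)
    elif m.search(r"Ich liebe (\w+)"):
        return "Er liebt " + m.group(1)
    elif m.search(r"Je t'aime (\w+)"):
        return "Il aime " + m.group(1)
    return "???"


print(translate("Oh! How I love thee"))  # He loves thee
```

Returning the string instead of printing it is a design choice made here only so that the result is easy to check.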
PaulMcG
  • 1
    It might be verbose, but you'll put the REMatcher class in a nice module which you'll import whenever needed. You wouldn't ask this question for an issue that won't come up again in the future, would you? – tzot Mar 31 '10 at 22:10
  • 4
    @ΤΖΩΤΖΙΟΥ: I agree; but, why isn't such a class in module re yet? – Curd Apr 01 '10 at 08:34
  • @Curd: because you're the one to bring it up. Thousands of other submitters to the Python code base have lived fine without it, so *why* should there be such a class in the re module? In any case, if you think such functionality belongs to the re module, you're more than welcome to supply a patch. Otherwise, please refrain from asking "why aren't things like I think they should be?" questions, because they are non-productive. – tzot Apr 01 '10 at 12:49
  • 17
    @ΤΖΩΤΖΙΟΥ: I disagree. Being satisfied by the fact that "thousands of others" didn't consider introducing it is just silly. How can I be sure that there is no good reason not to have such a class if I don't ask "Why"? I don't see one, but maybe somebody else does and can explain it (and thus give a better insight into the philosophy of Python). Here is a good example that such questions are productive: http://stackoverflow.com/questions/837265/why-is-there-no-operator-in-c-c – Curd Apr 04 '10 at 20:54
  • 1
    “Why” questions are generally productive, but your question falls in the subcategory “Why not how *I* like” (emphasis on “how I like”), which cannot be answered. You consider that such a function/class would be most useful, and then ask why others haven't acted upon it. For a change to occur, the motivated (here: you) has to justify the change to the rest of the community (here: the Python community). It's quite self-centered and non-productive to ask the community why your desired change hasn't already been introduced. – tzot Apr 05 '10 at 08:30
  • Anyway, I already answered your question, but I can rephrase if you need me to: the feature you ask for is very easy to implement, just like the itertools recipes in the documentation, and those are far more generic (and therefore stdlib-worthy) than your desired change. BTW, you might have noticed that *by design* assignments in Python are not expressions, which would have solved your issue. – tzot Apr 05 '10 at 08:36
  • Consider C `if ( a = funcall() )`: using an assignment's result as a condition. This feature is commonly misused and can easily become a "trap" in C code (which is maybe why it has been removed from other languages), BUT it could solve your problem and be a nice add-on with an appropriate operator (other than `=`, of course!). Imagine an operator `=?` , for example. You could write `if m =? re.search(...) : m.group(1)` . Using `elif`, you could end up writing code similar to the Perl fragment you posted. So, maybe the problem is more generic (not related to the re module implementation). – LRMAAX Jun 18 '17 at 04:44
  • you may want to compile the regex in `__init__` – Max Heiber Mar 11 '18 at 15:16
  • The regex is not known at `__init__` time – PaulMcG Mar 11 '18 at 15:54
  • Reminder, if you use python3 but the version is lower than 3.8, you can use `re.search` instead. – LianSheng Jun 11 '21 at 14:53
33

Less efficient, but simpler-looking:

m0 = re.match(r"I love (\w+)", statement)
m1 = re.match(r"Ich liebe (\w+)", statement)
m2 = re.match(r"Je t'aime (\w+)", statement)
if m0:
  print("He loves", m0.group(1))
elif m1:
  print("Er liebt", m1.group(1))
elif m2:
  print("Il aime", m2.group(1))

The Perl version works because each successful match implicitly updates a hidden variable ($1). That is hard to reproduce in Python, where updating a variable requires an explicit assignment statement.

The version with less repetition (and better efficiency) is this:

pats = [
    (r"I love (\w+)", "He loves {0}"),
    (r"Ich liebe (\w+)", "Er liebt {0}"),
    (r"Je t'aime (\w+)", "Il aime {0}"),
]
for pattern, template in pats:
    m = re.match(pattern, statement)
    if m:
        print(template.format(m.group(1)))
        break

A minor variation that some Perl folk prefer:

pats = {
    r"I love (\w+)": "He loves {0}",
    r"Ich liebe (\w+)": "Er liebt {0}",
    r"Je t'aime (\w+)": "Il aime {0}",
}
for pattern, template in pats.items():
    m = re.match(pattern, statement)
    if m:
        print(template.format(m.group(1)))
        break

This is hardly worth mentioning except it does come up sometimes from Perl programmers.
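
The table-driven loop above can also be wrapped in a function that stops at the first hit, so later patterns are never tried once one matches. This is a minimal sketch under a few assumptions: `PATTERNS` and `first_translation` are hypothetical names, the patterns are precompiled with re.compile, and re.search is used instead of re.match so the phrase may appear anywhere in the statement.

```python
import re

# Patterns are compiled once; list order decides which pattern wins.
PATTERNS = [
    (re.compile(r"I love (\w+)"), "He loves {0}"),
    (re.compile(r"Ich liebe (\w+)"), "Er liebt {0}"),
    (re.compile(r"Je t'aime (\w+)"), "Il aime {0}"),
]


def first_translation(statement):
    """Return the translation for the first matching pattern, else None."""
    for pattern, template in PATTERNS:
        m = pattern.search(statement)
        if m:
            return template.format(m.group(1))
    return None


print(first_translation("Ich liebe Margot"))  # Er liebt Margot
```

A list of pairs keeps the evaluation order explicit, which matters if more than one pattern could match the same statement.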

Neuron
S.Lott
  • 4
    @ S.Lott: ok, your solution avoids the if-else-cascade, but at the expense of doing unnecessary matches (m1 and m2 are not needed if m0 matches); that's why I am not really satisfied with this solution. – Curd Mar 31 '10 at 15:34
  • If the key order in your last variation is significant, be sure to tell the OP to use an OrderedDict. – PaulMcG May 18 '14 at 19:43
22

Starting with Python 3.8 and the introduction of assignment expressions (PEP 572, the := operator), we can capture the value of re.search(pattern, statement) in a variable (let's call it match), both to check that it is not None and to reuse it within the body of the condition:

if match := re.search(r"I love (\w+)", statement):
  print(f'He loves {match.group(1)}')
elif match := re.search(r"Ich liebe (\w+)", statement):
  print(f'Er liebt {match.group(1)}')
elif match := re.search(r"Je t'aime (\w+)", statement):
  print(f'Il aime {match.group(1)}')
Xavier Guihot
2

This is not a regex solution.

alist = {"I love ": "He loves", "Je t'aime ": "Il aime", "Ich liebe ": "Er liebt"}
for k in alist:
    if k in statement:
        print(alist[k], statement.split(k)[1:])
ghostdog74
1

You could create a helper function:

import re

def re_match_group(pattern, string, out_groups):
    # Clear any groups left over from a previous call.
    del out_groups[:]
    result = re.match(pattern, string)
    if result:
        # Copy the captured groups into the caller's list.
        out_groups[:] = result.groups()
    return result

And then use it like this:

groups = []
if re_match_group(r"I love (\w+)", statement, groups):
    print("He loves", groups[0])
elif re_match_group(r"Ich liebe (\w+)", statement, groups):
    print("Er liebt", groups[0])
elif re_match_group(r"Je t'aime (\w+)", statement, groups):
    print("Il aime", groups[0])

It's a little clunky, but it gets the job done.

Adam Rosenfield