In cases like this I like to use finditer, because the match objects it returns are easier to manipulate than the strings returned by findall. You can continue to match am/is/are, but also match the rest of the string with a second subgroup, and then extract only that group from the results.
>>> import re
>>> string = 'I am John'
>>> [m.group(2) for m in re.finditer("(am|is|are)(.*)", string)]
[' John']
Based on the structure of your pattern, I'm guessing you want at most one match out of the string. Consider using re.search instead of either findall or finditer.
>>> re.search("(am|is|are)(.*)", string).group(2)
' John'
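One caveat: re.search returns None when the pattern does not match, so calling .group(2) directly on the result raises AttributeError in that case. A minimal sketch of a guarded version follows; the helper name rest_after_verb is made up for illustration and is not part of the re module.

```python
import re

def rest_after_verb(text):
    # Hypothetical helper: return whatever follows the first am/is/are,
    # or None if the pattern does not match at all.
    m = re.search(r"(am|is|are)(.*)", text)
    return m.group(2) if m else None

print(repr(rest_after_verb("I am John")))    # ' John'
print(repr(rest_after_verb("Hello there")))  # None
```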
If you're thinking "actually I want to match every instance of a word following am/is/are, not just the first one", that's a problem, because your .* component will match the entire rest of the string after the first am/is/are. E.g. for the string "I am John and he is Steve", it will match ' John and he is Steve'. If you want John and Steve separately, perhaps you could limit the character class that you want to match. \w seems sensible:
>>> string = "I am John and he is Steve"
>>> [m.group(2) for m in re.finditer(r"(am|is|are) (\w*)", string)]
['John', 'Steve']
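Depending on your input, you may also want word boundaries. As written, the pattern can match am/is/are inside a longer word (the "is" in "this", for example), and \w* can capture an empty string. A sketch of a stricter variant, assuming you only care about standalone am/is/are followed by a single space and a word; the sample sentence and the variable names are mine:

```python
import re

text = "this is Steve and I am John"

# Without \b, the "is" inside "this" matches first and captures the word "is".
loose = [m.group(2) for m in re.finditer(r"(am|is|are) (\w*)", text)]

# With \b on both sides, only standalone am/is/are match, and \w+ requires
# the captured word to have at least one character.
strict = [m.group(2) for m in re.finditer(r"\b(am|is|are)\b (\w+)", text)]

print(loose)   # ['is', 'John']
print(strict)  # ['Steve', 'John']
```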