I would like to get the string after a specific keyword.

For example:

import re

def findWholeWord(w):
    # Build a case-insensitive whole-word pattern and return its search method.
    return re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE).search

abc = "<StephenCurry Pro='ThreepointShooter'>MVP1times</StephenCurry>"
if findWholeWord("SeedNumber")(abc):
    dddd = re.search('(?<=ThreepointShooter)(.\w+)', abc)
    mvp = dddd.group()
    print(mvp)

    print("found")
else:
    print("not found")

I expect the result to be 'MVP1times'.

Is there a better way to find a specific string after a keyword? The result may be letters, digits, or a mix of both, like the result above.

Thanks for help!

Dsw Wds

2 Answers

You can use look-arounds to get the string surrounded by > and < (assuming this stays consistent):

>>> import re
>>> s = "<StephenCurry Pro='ThreepointShooter'>MVP1times</StephenCurry>"
>>> re.search(r'(?<=>)[^<]+(?=<)', s).group(0)
'MVP1times'
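
As a minimal sketch of the same look-around idea, the search can be wrapped so that a missing match returns None instead of raising AttributeError on .group(). The helper name text_between_tags is hypothetical and not part of the original answer; it assumes the value always sits directly between a > and the next <.

```python
import re

# Hypothetical helper name, introduced only for illustration.
def text_between_tags(s):
    """Return the first run of text between '>' and '<', or None if absent."""
    match = re.search(r'(?<=>)[^<]+(?=<)', s)
    return match.group(0) if match else None

s = "<StephenCurry Pro='ThreepointShooter'>MVP1times</StephenCurry>"
print(text_between_tags(s))        # MVP1times
print(text_between_tags("plain"))  # None
```

Checking the match object before calling .group() avoids the NoneType error when the input does not have the expected shape.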
heemayl

You can change the regular expression to: (?<=ThreepointShooter['|"]>)(.\w+). See it live on http://pythex.org/

I'm not sure exactly what you're trying to do, but you don't even need a lookbehind expression here.
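
One way to do this without a lookbehind, sketched here under some assumptions, is to match the keyword as ordinary text and put only the wanted value in a capturing group. The function name value_after is hypothetical, and the sketch assumes the keyword is always followed by a closing quote and a >. The character class is written as ['"] because a | inside square brackets is a literal character, not an alternation.

```python
import re

# Hypothetical helper name, introduced only for illustration.
def value_after(keyword, text):
    """Return the word that follows keyword, a closing quote, and '>'."""
    # re.escape guards against regex metacharacters in the keyword.
    pattern = re.escape(keyword) + r"""['"]>(\w+)"""
    match = re.search(pattern, text)
    return match.group(1) if match else None

abc = "<StephenCurry Pro='ThreepointShooter'>MVP1times</StephenCurry>"
print(value_after("ThreepointShooter", abc))  # MVP1times
print(value_after("SeedNumber", abc))         # None
```

Because the keyword is part of the match rather than a lookbehind, there is no fixed-width restriction on it, and group(1) returns only the captured value.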

martin
  • Thanks! I will try it and see. – Dsw Wds Feb 29 '16 at 08:52
  • dddd = re.search("""?<=ThreepointShooter['|"]>)(.\w+)""" , abc) How do I avoid the quotation mark? I get None in dddd in this case. – Dsw Wds Feb 29 '16 at 09:11
  • Originally my code was: dddd = re.search('(?<=ThreepointShooter)(.\w+)', abc), and your pattern (?<=ThreepointShooter['|"]>)(.\w+) contains an apostrophe, so how do I avoid it? – Dsw Wds Mar 01 '16 at 01:13
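
On the quoting question in the comments, a pattern that contains both ' and " can be written in Python either as a raw triple-quoted string or as a normal string with the apostrophe escaped. The sketch below is an illustration rather than the answerer's own code. It restores the opening parenthesis of the lookbehind, and it simplifies the character class to ['"] and the group to (\w+); both simplifications are assumptions.

```python
import re

abc = "<StephenCurry Pro='ThreepointShooter'>MVP1times</StephenCurry>"

# Raw triple-quoted string: ' and " can both appear unescaped inside it.
pattern_a = r"""(?<=ThreepointShooter['"]>)(\w+)"""

# Same pattern in a single-quoted string: escape the apostrophe and the backslash.
pattern_b = '(?<=ThreepointShooter[\'"]>)(\\w+)'

match = re.search(pattern_a, abc)
mvp = match.group() if match else None
print(mvp)                     # MVP1times
print(pattern_a == pattern_b)  # True
```

The lookbehind still works here because its content has a fixed width, which Python's re module requires.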