
I'm trying to write a script that finds a single word, or a phrase made up of several words, in a given string. I've found this answer, which looks very much like what I need, but I can't really understand how it works.

Using the code provided in the answer mentioned above, I have this:

import re

def findWholeWord(w):
    return re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE).search

st1 = 'those who seek shall find'
st2 = 'swordsmith'

print findWholeWord('seek')(st1)          # -> <match object>
print findWholeWord('these')(st1)         # -> None
print findWholeWord('THOSE')(st1)         # -> <match object>
print findWholeWord('seek shall')(st1)    # -> <match object>
print findWholeWord('word')(st2)          # -> None

This function returns either something like <_sre.SRE_Match object at 0x94393e0> (when the word(s) are found) or None (when they aren't). I'd like it to return True or False instead, depending on whether the word(s) were found. Since I'm not clear on how the function works, I'm not sure how to do that.

I've never seen a function called with two sets of parentheses (?), i.e. findWholeWord(word)(string). What is this doing?

Gabriel

2 Answers


re is the regular expression module. findWholeWord creates a regular expression object that will match the word (pattern) you pass it. findWholeWord returns a function: the search method of that regular expression object. Notice the absence of '()' after search in the return statement, which means the method is returned rather than called.

import re
def findWholeWord(w):
    return re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE).search

>>> search = findWholeWord('seek')
>>> print search
<built-in method search of _sre.SRE_Pattern object at 0x032F70A8>
>>>

The search method returns a match object if the pattern is found, or None if it is not. Match objects always evaluate to True in a boolean context, and None evaluates to False.

>>> search = findWholeWord('seek')
>>> print search
<built-in method search of _sre.SRE_Pattern object at 0x032F70A8>
>>> 
>>> match = search('this string contains seek')
>>> print match, bool(match)
<_sre.SRE_Match object at 0x032FDC20> True
>>> match = search('this string does not contain the word you are looking for')
>>> print match, bool(match)
None False
>>>

In your example, findWholeWord('seek')(st1) first calls findWholeWord('seek'), which returns the search method of a regular expression that matches 'seek', and then immediately calls that method with the string st1.

>>> st1 = 'those who seek shall find'
>>> match = search(st1)
>>> print match, bool(match)
<_sre.SRE_Match object at 0x032FDC20> True
>>> match = findWholeWord('seek')(st1)
>>> print match, bool(match)
<_sre.SRE_Match object at 0x032FDC60> True
>>> 
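To get True or False directly, the two calls can be written out one at a time and the result compared against None. The following is a minimal sketch, not the original author's code: contains_whole_word is a hypothetical helper name introduced here for illustration.

```python
import re

def findWholeWord(w):
    # Returns the bound search method of the compiled pattern (not its result).
    return re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE).search

def contains_whole_word(w, text):
    # Hypothetical helper: the same two calls, written out one at a time.
    search = findWholeWord(w)   # first pair of parentheses
    match = search(text)        # second pair of parentheses
    return match is not None    # True if found, False otherwise

st1 = 'those who seek shall find'
print(contains_whole_word('seek', st1))
print(contains_whole_word('these', st1))
```

Writing bool(findWholeWord(w)(text)) would give the same result, because a match object is truthy and None is falsy.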
wwii
  • This answers the question `I've never seen a function being called passing two variables (?), ie: findWholeWord(word)(string), what is this doing?` – wwii Jun 14 '14 at 19:23
# Inside a function (a bare return is only valid there):
if findWholeWord('seek')(st1) is None:
    return False
else:
    return True

Or:

if findWholeWord('seek')(st1):  # a match object is truthy
    pass  # found: do something
else:
    pass  # no match: do something else

Or:

import re

def findWholeWord(w, string):
    pattern = re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE)
    # search returns a match object or None; bool() turns that into True/False
    return bool(pattern.search(string))
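One caveat with the versions above: the word is inserted into the pattern unescaped, so characters such as '.' or '+' are treated as regex syntax. If the word comes from user input, escaping it first may be safer. This is a sketch under that assumption; find_whole_word_escaped is a hypothetical name, and the use of re.escape is an addition not found in the original answers.

```python
import re

def find_whole_word_escaped(w, text):
    # Hypothetical variant: re.escape makes characters such as '.' or '+' literal.
    pattern = re.compile(r'\b{0}\b'.format(re.escape(w)), flags=re.IGNORECASE)
    return pattern.search(text) is not None

# Without escaping, 'a.c' would also match 'abc' because '.' matches any character.
print(find_whole_word_escaped('a.c', 'abc'))
print(find_whole_word_escaped('a.c', 'x a.c y'))
```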
Reloader
  • First: I didn't downvote (I don't understand users who downvote without explaining why, so I upvoted your answer). Your first answer would work but it's a bit hackish. The second answer works, but I'm still left wondering how the double `()()` works. Your third answer is clearer to me and it works just fine. Thank you! – Gabriel Jun 14 '14 at 19:05
  • @Gabriel I downvoted because it didn't provide an answer to your questions and it was just a block of code with no explanation – Tim Jun 14 '14 at 20:42