I'm trying to extract given words from a string using a regex:
>>> pattern = re.compile(ur'(今天|不错)', re.UNICODE)
>>> print pattern.search(u'今天天气不错').groups()
(u'\u4eca\u5929',)
As you can see, only the first word is matched. What's wrong here?
I think you are looking for re.findall():
>>> print pattern.findall(u'今天天气不错')
[u'\u4eca\u5929', u'\u4e0d\u9519']
findall() returns all non-overlapping matches of the pattern in the string, whereas re.search() stops at the first match. The documentation describes search() as follows:
Scan through string looking for the first location where the regular expression pattern produces a match, and return a corresponding MatchObject instance.
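To make the difference concrete, here is a small sketch comparing search(), findall() and finditer() on the same input. It assumes Python 3, where string literals are Unicode by default and the ur'' prefix and re.UNICODE flag from the question are not needed; the variable names are only for illustration.

```python
import re

# Same alternation as in the question, written as a Python 3 raw string.
pattern = re.compile(r'(今天|不错)')
text = '今天天气不错'

# search() stops at the first match, so only one word is captured.
first = pattern.search(text)
first_word = first.group(1)

# findall() collects every non-overlapping match; with a single
# capturing group it returns a list of that group's strings.
all_words = pattern.findall(text)

# finditer() yields match objects, which is handy when the
# position of each word is needed as well.
spans = [(m.group(1), m.start(), m.end()) for m in pattern.finditer(text)]

print(first_word)
print(all_words)
print(spans)
```

If you only need the matched strings, findall() is enough; finditer() is worth reaching for when you also want offsets or other match details.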