In regular expressions, . matches any single character (except a newline, by default) and .* matches any number of such characters (0 or more).
When you use w.*m, Python by default looks for the longest substring that starts with w and ends with m. This is called a greedy match.
To find the shortest substring that starts with w and ends with m, you have to search non-greedily. For this, instead of using w.*m, use w.*?m.
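Because of the ? placed after the *, Python stops at the first position where the rest of the pattern can match, instead of the last one. Note that this is a different use of ? from the plain quantifier. On its own, ? causes the resulting RE to match 0 or 1 repetitions of the preceding RE (for example, ab? will match either a or ab). Placed directly after another quantifier such as *, + or ?, it makes that quantifier non-greedy.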
So here, w.*?m matches the fewest characters possible, starting at a w (included) and ending at the next m (included).
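The two roles of ? described above can be checked side by side. This is only a minimal sketch: the variable names and the use of re.fullmatch for the ab? case are choices made here for illustration, not part of the original answer.

```python
import re

# `?` on its own: the preceding item may appear 0 or 1 times.
matches_a = re.fullmatch(r"ab?", "a") is not None      # "a" is accepted
matches_ab = re.fullmatch(r"ab?", "ab") is not None    # "ab" is accepted
matches_abb = re.fullmatch(r"ab?", "abb") is not None  # "abb" is rejected

# `?` directly after a quantifier: that quantifier becomes non-greedy.
text = "wish I may, I wish I might"
greedy = re.search(r"w.*m", text).group()  # runs to the last m
lazy = re.search(r"w.*?m", text).group()   # stops at the first m

print(matches_a, matches_ab, matches_abb)  # True True False
print(greedy)  # wish I may, I wish I m
print(lazy)    # wish I m
```

The same character therefore means "optional" after an ordinary item and "match as little as possible" after *, + or ?.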
>>> s = '''I wish I may, I wish I might
... Have a dish of fish tonight.'''
>>>
>>> import re
>>> m = re.search(r'w.*m', s)  # greedy search
>>> print(m.group())
wish I may, I wish I m
>>> m = re.search(r'w.*?m', s)  # non-greedy search
>>> print(m.group())
wish I m
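The session above uses re.search, which returns only the first match. As a hedged extension, the sketch below uses re.findall on the same string to list every non-overlapping match, which makes the difference easier to see. The names greedy_all and lazy_all are invented for this example.

```python
import re

s = "I wish I may, I wish I might\nHave a dish of fish tonight."

# Greedy: one long match from the first w to the last m on the line.
greedy_all = re.findall(r"w.*m", s)

# Non-greedy: each w is paired with the nearest following m,
# so the same line yields two short matches.
lazy_all = re.findall(r"w.*?m", s)

print(greedy_all)  # ['wish I may, I wish I m']
print(lazy_all)    # ['wish I m', 'wish I m']
```

Neither pattern matches anything on the second line: it contains no w, and . does not cross the newline by default.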
Read more about REGULAR EXPRESSIONS here