I was experimenting with Python regular expressions and trying to get the first and last word of a string using `^\w*` and `\w*$`. But I am getting the following results:

>>> re.findall(r'^\w*', 'This is a test string')
['This']

#to get the last word
>>> re.findall(r'\w*$', 'This is a test string')
['string', '']

Can someone explain how this regex works and why I am getting the empty element after 'string' (`['string', '']`)? Note: it works as expected with `^\w+` and `\w+$`.

Al Imran
  • Try this; it captures the first and last word: `re.findall(r'^(\w+).*\b(\w+).*?$', 'This is a test string')` – Kunal Mukherjee May 08 '19 at 06:58
  • A usual thing with regex: a `*`-quantified pattern such as `\w*` can match an empty string. `\w` only matches word chars, so `\w*` matches 0+ word chars and stops at the end of the string. That is the first match. The second match comes from the regex also matching the empty string at the end-of-string position. – Wiktor Stribiżew May 08 '19 at 06:59
  • Some may wonder why `$` does not "gobble" the end-of-string position: it is a *zero-width assertion*, so it does not consume any text (i.e. the regex index stays where it was before the pattern was tried). This is also why `(?!^)` and `(?<!^)` are equivalent, since lookarounds are zero-width assertions too. – Wiktor Stribiżew May 08 '19 at 07:22
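
To make the explanation in the comments concrete, here is a minimal sketch that prints where each match starts and ends. It uses `re.finditer` instead of `re.findall` only so the positions are visible; the variable names (`text`, `end_matches`, `start_matches`) are just for illustration and are not from the question.

```python
import re

text = 'This is a test string'

# Collect (matched text, start, end) for every match of \w*$.
end_matches = [(m.group(), m.start(), m.end())
               for m in re.finditer(r'\w*$', text)]
print(end_matches)
# [('string', 15, 21), ('', 21, 21)]
# The first match consumes 'string'. Because $ is zero-width, the scan
# resumes at index 21, where \w* matches zero characters and $ still holds.

# ^ can only succeed at index 0, so no second (empty) match is possible.
start_matches = [(m.group(), m.start(), m.end())
                 for m in re.finditer(r'^\w*', text)]
print(start_matches)
# [('This', 0, 4)]

# Two ways to avoid the trailing empty string:
print(re.findall(r'\w+$', text))         # ['string'] (+ needs at least one char)
print(re.search(r'\w*$', text).group())  # 'string'   (search stops at the first match)
```

The `\w+` variant mentioned in the question works because an empty string can no longer satisfy the pattern, so nothing matches at index 21.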
