I'm learning regular expressions in Python and have run into this problem.
Assume I have a variable called s:
>>> print(repr(s))
'HTML elements include\n\n* headings\n* paragraphs\n* lists\n* links\n* and more\n\nTry it!!!'
I want to match the '* headings\n* paragraphs\n* lists\n* links\n* and more\n' part of s (each item starts with *, ends with \n, and the items repeat as many times as possible), so my code is:
>>> print(re.findall(r'(\*.+?\n)+', s))
['* and more\n']
I don't understand why only the last item is returned. But when I use re.sub() with the same pattern, the whole list is replaced:
>>> print(re.sub(r'(\*.+?\n)+', 'text', s))
HTML elements include

text
Try it!!!
This shows that re.sub() matches the whole part I want, so I'm really confused about why re.findall() returns only the last item. Thanks for your time.
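For reference, here is a minimal self-contained script that reproduces both results, assuming s is exactly the string shown above. The helper name reproduce is just something made up for this post, and the extra re.search(...).group(0) call is only there to compare the full match against what re.findall() returns.

```python
import re

# The string from the question, copied from the repr() output.
s = 'HTML elements include\n\n* headings\n* paragraphs\n* lists\n* links\n* and more\n\nTry it!!!'

# Same pattern as in the question: one capturing group, repeated.
pattern = r'(\*.+?\n)+'


def reproduce(text):
    """Hypothetical helper: run findall, search and sub with the same pattern."""
    found = re.findall(pattern, text)          # list of group 1 captures
    whole = re.search(pattern, text).group(0)  # full text of the first match
    replaced = re.sub(pattern, 'text', text)   # every full match replaced
    return found, whole, replaced


found, whole, replaced = reproduce(s)
print(found)
print(repr(whole))
print(repr(replaced))
```

Comparing found with whole should make the difference between the two calls visible in a single run.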