I'm getting unexpected results when I use re.DOTALL with re.finditer() in Python 3.6. I don't know whether this is the expected behavior, whether I'm missing something, or whether it's a bug.
CASE 1
First I try the pattern on a string with an embedded newline, with no flags set.
I expect to get 2 matches back: m1 = 'abc' and m2 = ' de'.
import re
result = re.finditer('.*', 'abc\n de', flags=0)
m1 = next(result)
# <_sre.SRE_Match object; span=(0, 3), match='abc'>
m2 = next(result)
# <_sre.SRE_Match object; span=(3, 3), match=''>
m3 = next(result)
# <_sre.SRE_Match object; span=(4, 7), match=' de'>
m4 = next(result)
# <_sre.SRE_Match object; span=(7, 7), match=''>
What's with the empty matches m2 and m4?
CASE 2
Next I try the same call with re.DOTALL, and I expect to get back a single match, m1 = 'abc\n de'.
result = re.finditer('.*', 'abc\n de', flags=re.DOTALL)
m1 = next(result)
# <_sre.SRE_Match object; span=(0, 7), match='abc\n de'>
m2 = next(result)
# <_sre.SRE_Match object; span=(7, 7), match=''>
What's with the extra empty matches? How do I get the results I expect?
I want the first case to return ...
m1 = 'abc'
m2 = ' de'
... and the second case to return
m1 = 'abc\n de'
and nothing else.
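For reference, here is a minimal sketch of the output I'm after. It assumes the empty matches appear because `.*` is allowed to match zero characters, so it swaps in `.+` (one or more characters). That swap is my own guess at a fix, not something I've confirmed is the recommended approach, and the names `text`, `case1` and `case2` are just for illustration.

```python
import re

text = 'abc\n de'

# Assumption: '.*' can match the empty string, which would explain the
# zero-width matches at spans (3, 3) and (7, 7). '.+' needs at least one
# character, so it cannot produce an empty match.

# Case 1: without DOTALL, '.' does not match '\n', so each line matches separately.
case1 = [m.group() for m in re.finditer('.+', text)]

# Case 2: with DOTALL, '.' also matches '\n', so the whole string is one match.
case2 = [m.group() for m in re.finditer('.+', text, flags=re.DOTALL)]

print(case1)  # ['abc', ' de']
print(case2)  # ['abc\n de']
```

If the pattern has to stay as `.*`, another option that seems to work is to skip zero-length matches while iterating, e.g. `[m for m in re.finditer('.*', text) if m.group()]`.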