I have a lot of simple strings like this one:
"amount: 134.707625 delay: 180"
and want to extract the two numbers using the following regular expression: '\d+(\.\d+)?'
It matches both numbers, but extracting them with findall returns ['.707625', '']
whereas the semantically equivalent regex '\d+\.\d+|\d+'
returns the desired output ['134.707625', '180']
Why do these two regexes behave differently? Here is my test code:
import re
# Pattern with an optional capturing group for the fractional part
pattern = re.compile(r'\d+(\.\d+)?')
print(pattern.findall("amount: 134.707625 delay: 180"))
print(pattern.match('134.707625'))
print(pattern.match('180'))
# Alternation without any group
pattern2 = re.compile(r"\d+\.\d+|\d+")
print(pattern2.findall("amount: 134.707625 delay: 180"))
print(pattern2.match('134.707625'))
print(pattern2.match('180'))
and here's the corresponding output:
> python temp.py
['.707625', '']
<_sre.SRE_Match object; span=(0, 10), match='134.707625'>
<_sre.SRE_Match object; span=(0, 3), match='180'>
['134.707625', '180']
<_sre.SRE_Match object; span=(0, 10), match='134.707625'>
<_sre.SRE_Match object; span=(0, 3), match='180'>
I'm using Python 3.5.2 from the Anaconda distribution on Windows 10.
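The difference seems to come from the capturing group rather than from what the patterns match. As far as I can tell from the re documentation, findall returns the contents of the group when the pattern contains exactly one capturing group, and the whole match only when there are no groups. Below is a minimal sketch comparing the original pattern with a non-capturing variant, (?:...), and with finditer. The variable names are only for illustration.

```python
import re

text = "amount: 134.707625 delay: 180"

# One capturing group: findall returns what the group captured,
# not the whole match (an empty string when the group did not participate).
capturing = re.compile(r'\d+(\.\d+)?')
group_results = capturing.findall(text)

# Non-capturing group (?:...): the pattern has no groups,
# so findall returns the whole match.
non_capturing = re.compile(r'\d+(?:\.\d+)?')
whole_results = non_capturing.findall(text)

# finditer yields match objects; group(0) is always the whole match,
# whether or not the pattern contains capturing groups.
finditer_results = [m.group(0) for m in capturing.finditer(text)]

print(group_results)     # ['.707625', '']
print(whole_results)     # ['134.707625', '180']
print(finditer_results)  # ['134.707625', '180']
```

This would also explain why match looks fine in both cases: the match object reports the span of the whole match, while findall with a group reports only the group.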