I am trying to write a regex to read lines such as:
* DCH : 0.80000000 *
* PYR : 100.00000000 *
* Bond ( 1, 0) : 0.80000000 *
* Angle ( 1, 0, 2) : 100.00000000 *
To that end, I wrote the regex below. It works, but I would like some feedback on how to capture the integers in parentheses. In the third and fourth lines above, the integers between parentheses form a kind of tuple; this part is optional, since the first two lines do not have it.
I had to define several groups in order to make that tuple of integers optional and to handle the fact that it may contain 2, 3 or 4 integers.
In [64]: coord_patt = re.compile(r"\s+(\w+)\s+(\(((\s*\d+),?){2,4}\))?\s+:\s+(\d+\.\d+)")
In [65]: line2 = "* Angle ( 1, 0, 2) : 100.00000000 *"
In [66]: m = coord_patt.search(line2)
In [67]: m.groups()
Out[67]: ('Angle', '( 1, 0, 2)', ' 2', ' 2', '100.00000000')
Another example:
In [68]: line = " * Bond ( 1, 0) : 0.80000000 *"
In [69]: m = coord_patt.search(line)
In [71]: m.groups()
Out[71]: ('Bond', '( 1, 0)', ' 0', ' 0', '0.80000000')
As you can see, it works, but I do not understand why the groups contain only the last integer rather than each integer separately. Is there a way to get the integers individually? Alternatively, can I avoid defining all those groups and capture only group 2, the string form of the tuple, which can easily be parsed afterwards?
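For reference, the fallback I have in mind (capture only the tuple as a string and parse it in a second step) would look roughly like the sketch below. The helper name parse_line is only illustrative, and loosening the whitespace around the tuple to \s* is my own assumption so that lines without a tuple also match; I am not sure this is the cleanest approach.

```python
import re

# Same idea as the regex above, but the repeated part uses a non-capturing
# group (?:...), so the only groups are: name, whole tuple string, value.
# The dot is escaped so that it matches a literal decimal point.
coord_patt = re.compile(
    r"\s+(\w+)\s*(\((?:\s*\d+,?){2,4}\))?\s*:\s+(\d+\.\d+)"
)

def parse_line(line):
    """Return (name, indices, value), or None if the line does not match."""
    m = coord_patt.search(line)
    if m is None:
        return None
    name, tuple_str, value = m.groups()
    # tuple_str is None when the optional tuple is absent from the line.
    indices = tuple(int(n) for n in re.findall(r"\d+", tuple_str or ""))
    return name, indices, float(value)

print(parse_line("* Angle ( 1, 0, 2) : 100.00000000 *"))
print(parse_line("* DCH : 0.80000000 *"))
```

With this, the individual integers come from re.findall on the captured tuple string rather than from the groups of the main regex.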