I need help developing a robust regular expression for key-value pairs written in Fortran syntax, in the format:
name = values* name=values* ...
Example
The string:
name="my name is" multipleValues = 0.543 0.754 1.166 multipleValues(2) = 'value' "Value2" 4.76454 100 single(1) = 10 single(2)=3.589 boolean = .True. .F. ! comment to mess things up
Should be split up into:
(name, "my name is")
(multipleValues, [0.543, 0.754, 1.166])
(multipleValues(2), ['value', "Value2", 4.76454, 100])
(single(1), 10)
(single(2), 3.589)
(boolean, [.True., .F.])
Tried
Using the regex from this question sort of works:
"((?:\"[^\"]*\"|[^=,])*)=((?:\"[^\"]*\"|[^=,])*)"
However, each value list also swallows the name of the following pair, so every name after the first comes back empty:
>>> re.findall('((?:\"[^\"]*\"|[^=,])*)=((?:\"[^\"]*\"|[^=,])*)', testStr)
[('name', "'my name is' multipleValues "), ('', ' 0.543 0.754 1.166 multipleValues(2) '), ('', " 'value' 'Value2' 4.76454 100 single(1) "), ('', ' 10 single(2)'), ('', '3.589 boolean '), ('', ' .True. .F. ! comment to mess things up')]
Maybe I need a lookbehind?
Note: The solution does not need to be a single expression.
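To illustrate the kind of multi-step solution that would be acceptable, here is a minimal sketch of one possible approach: tokenize first, then group tokens by looking for a following `=`. The names `TOKEN` and `parse_pairs` are hypothetical, values are left as strings, and it assumes quoted strings contain no escaped quotes and that `!` outside quotes always starts a comment. It is not a full Fortran namelist parser.

```python
import re

test_str = (
    'name="my name is" multipleValues = 0.543 0.754 1.166 '
    'multipleValues(2) = \'value\' "Value2" 4.76454 100 '
    'single(1) = 10 single(2)=3.589 boolean = .True. .F. '
    '! comment to mess things up'
)

# One token is: a double-quoted string, a single-quoted string,
# a "!" comment running to end of line, an equals sign, or a bare word.
# (Hypothetical pattern; assumes no escaped quotes inside strings.)
TOKEN = re.compile(r'''"[^"]*"|'[^']*'|!.*|=|[^\s=!"']+''')


def parse_pairs(text):
    """Return a list of (name, [values]) tuples; values stay as strings."""
    # Drop comment tokens; quoted strings never start with "!".
    tokens = [t for t in TOKEN.findall(text) if not t.startswith('!')]
    pairs = []
    for i, tok in enumerate(tokens):
        if tok == '=':
            continue
        if i + 1 < len(tokens) and tokens[i + 1] == '=':
            # A token directly followed by "=" starts a new pair.
            pairs.append((tok, []))
        elif pairs:
            # Anything else is a value of the most recent name.
            pairs[-1][1].append(tok)
    return pairs


for name, values in parse_pairs(test_str):
    print(name, values)
```

With the example string, this groups `"my name is"` under `name`, the three floats under `multipleValues`, and so on, with the trailing comment discarded. Converting the value strings to numbers or booleans would be a separate step.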