You could use a regex like ([+-])?\s*(?:(\d+)\s*\*\s*)?([a-z]\w*). Here, the first part (the coefficient, including the *) is optional, but only the sign and the actual number are captured, along with the variable name. You can then convert those matches to a dict.
>>> import re
>>> s = "- x0 + 2 * x1 + 4 * x2 - 3 * y + z + 8"
>>> p = r"([+-])?\s*(?:(\d+)\s*\*\s*)?([a-z]\w*)"
>>> re.findall(p, s)
[('-', '', 'x0'),
('+', '2', 'x1'),
('+', '4', 'x2'),
('-', '3', 'y'),
('+', '', 'z')]
>>> {var: int(sign + (coef or '1')) for (sign, coef, var) in _}
{'x0': -1, 'x1': 2, 'x2': 4, 'y': -3, 'z': 1}
Breakdown of the regex:
([+-])?\s*: optional sign, followed by optional whitespace
(?:(\d+)\s*\*\s*)?: optional coefficient, followed by * and optional whitespace; only the digits are captured
([a-z]\w*): name of the variable
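If you need this outside an interactive session (where _ is not available), the same steps could be wrapped in a small function. This is only a sketch: the names TERM and parse_coefficients are made up for illustration, and summing the coefficients of a repeated variable is an assumption on my part, since the dict comprehension above would simply keep the last occurrence.

```python
import re

# Same pattern as above: optional sign, optional "<digits> *", then the variable name.
TERM = re.compile(r"([+-])?\s*(?:(\d+)\s*\*\s*)?([a-z]\w*)")

def parse_coefficients(expr):
    """Return {variable: integer coefficient} for a linear expression string."""
    coeffs = {}
    for sign, coef, var in TERM.findall(expr):
        # A missing coefficient means 1; a missing sign means positive.
        value = int(sign + (coef or "1"))
        # Assumption: repeated variables are summed rather than overwritten.
        coeffs[var] = coeffs.get(var, 0) + value
    return coeffs

print(parse_coefficients("- x0 + 2 * x1 + 4 * x2 - 3 * y + z + 8"))
```

Note that the constant term (+ 8) is skipped, because the pattern only matches terms that end in a variable name.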