I am parsing some data that looks like this:
Fourier analysis for v(1):
No. Harmonics: 20, THD: 24.6928 %, Gridsize: 200, Interpolation Degree: 1
Harmonic Frequency Magnitude Phase Norm. Mag Norm. Phase
-------- --------- --------- ----- --------- -----------
0 0 -1.4108e-005 0 0 0
1 100 1.81678 179.986 1 0
2 200 2.67431e-005 -89.68 1.472e-005 -269.67
3 300 0.374737 179.937 0.206264 -0.049661
4 400 2.57338e-005 -89.357 1.41645e-005 -269.34
5 500 0.185804 179.876 0.102271 -0.1108
6 600 2.46676e-005 -89.033 1.35777e-005 -269.02
7 700 0.112225 179.799 0.0617716 -0.18748
8 800 2.37755e-005 -88.71 1.30866e-005 -268.7
9 900 0.0757484 179.708 0.0416937 -0.27803
10 1000 2.31014e-005 -88.392 1.27156e-005 -268.38
11 1100 0.0558207 179.611 0.0307251 -0.37527
12 1200 2.25406e-005 -88.082 1.24069e-005 -268.07
13 1300 0.0439558 179.513 0.0241943 -0.47325
14 1400 2.19768e-005 -87.779 1.20966e-005 -267.77
15 1500 0.0362049 179.416 0.019928 -0.5704
16 1600 2.13218e-005 -87.483 1.1736e-005 -267.47
17 1700 0.0305653 179.316 0.0168239 -0.67046
18 1800 2.0553e-005 -87.194 1.13128e-005 -267.18
19 1900 0.0260612 179.207 0.0143447 -0.77967
Several columns hold float data. I can write a regular expression for a float; here is the part of the pattern that captures one such column:
(?P<Magnitude>[-+]?(?:(?:\d*\.\d+)|(?:\d+\.?))(?:[Ee][+-]?\d+)?)
This part [-+]?(?:(?:\d*\.\d+)|(?:\d+\.?))(?:[Ee][+-]?\d+)?
is pretty complicated, so it would be nice if there were some way to name it and reuse it, sort of like defining your own meta-character and writing \f+ or something (i.e. as if \f were the meta-character for float, which it is not). By the way, I got this pattern from this question.
So I am looking for a good approach for containing this complexity. I could probably use string concatenation or formatting on the pattern string, but I am wondering if there is some better way. Maybe I am missing something obvious.
Here is the unwieldy expression:
re.compile(r"^\s*(?P<Harmonic>\d+)\s+(?P<Frequency>\d+)\s+(?P<Magnitude>[-+]?(?:(?:\d*\.\d+)|(?:\d+\.?))(?:[Ee][+-]?\d+)?)\s+(?P<Phase>[-+]?(?:(?:\d*\.\d+)|(?:\d+\.?))(?:[Ee][+-]?\d+)?)\s+(?P<NormMag>[-+]?(?:(?:\d*\.\d+)|(?:\d+\.?))(?:[Ee][+-]?\d+)?)\s+(?P<NormPhase>[-+]?(?:(?:\d*\.\d+)|(?:\d+\.?))(?:[Ee][+-]?\d+)?)\s+$", re.MULTILINE)
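For comparison, the string-concatenation/formatting idea mentioned above might look roughly like the sketch below. It is not necessarily the best approach, only one way to keep the float pattern in a single place. The names FLOAT, named, COLUMNS and ROW are made up for illustration, and the trailing \s*$ is an assumption that relaxes the original \s+$ so that rows without trailing whitespace also match.

```python
import re

# Float sub-pattern copied from the question, defined once and reused.
FLOAT = r"[-+]?(?:(?:\d*\.\d+)|(?:\d+\.?))(?:[Ee][+-]?\d+)?"


def named(name, pattern):
    """Wrap a sub-pattern in a named capturing group (illustrative helper)."""
    return "(?P<{}>{})".format(name, pattern)


# Illustrative column layout: (group name, sub-pattern) in table order.
COLUMNS = [
    ("Harmonic", r"\d+"),
    ("Frequency", r"\d+"),
    ("Magnitude", FLOAT),
    ("Phase", FLOAT),
    ("NormMag", FLOAT),
    ("NormPhase", FLOAT),
]

# Join the named groups with whitespace separators.
# Assumption: trailing whitespace is optional (\s*), unlike the original \s+$.
ROW = re.compile(
    r"^\s*" + r"\s+".join(named(n, p) for n, p in COLUMNS) + r"\s*$",
    re.MULTILINE,
)

sample = "3 300 0.374737 179.937 0.206264 -0.049661"
match = ROW.match(sample)
if match:
    # Captured values are strings; convert the float columns explicitly.
    row = match.groupdict()
    print(row["Harmonic"], float(row["Magnitude"]), float(row["NormPhase"]))
```

Adding or reordering a column then only means editing the COLUMNS list, and any fix to the float pattern happens in one place.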