I've slurped a file into a big string. I want to parse the string and build up a list of dicts, one per jobno. Each job will have a variable number of key/value pairs, in no particular order. The only thing I can count on is that a jobno:xxxx pair always denotes the beginning of a new job.
Python 2.7:
import re
bigstr = "jobno: 4859305 jobtype: ASSEMBLY name: BLUEBALLOON color: red jobno: 3995433 name: SNEAKYPETE jobtype: PKG texture: crunchy"
# capture only the key and the value, so groups() unpacks into exactly two names
regexJobA = re.compile(r'(\w+):\s(\w+)\s?', re.DOTALL)
for mo in regexJobA.finditer(bigstr):
    keyy, valu = mo.groups()
    print keyy + ":" + valu
yields
jobno:4859305
jobtype:ASSEMBLY
name:BLUEBALLOON
color:red
jobno:3995433
name:SNEAKYPETE
jobtype:PKG
texture:crunchy
which I could hammer/file/sand/paint to work. But there must be a more elegant regex that would build up the jobs implicitly, something like
regexJobB = re.compile(r'((jobno):\s(\w+)\s?)((*not_jobno*):\s(\w+)\s?)+', re.DOTALL)
would do the trick. But how to negate the (jobno) group? Or use some lookahead/lookbehind/lookaround cleverness to yield
jobno:4859305 jobtype:ASSEMBLY name:BLUEBALLOON color:red
jobno:3995433 name:SNEAKYPETE jobtype:PKG texture:crunchy
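For concreteness, here is a minimal sketch of the kind of lookahead I have in mind. It assumes every key and value is a single \w+ token and that pairs are separated by exactly one whitespace character. The names job_re, pair_re and jobs are just ones I made up for illustration, and there may well be a cleaner way:

```python
import re

bigstr = ("jobno: 4859305 jobtype: ASSEMBLY name: BLUEBALLOON color: red "
          "jobno: 3995433 name: SNEAKYPETE jobtype: PKG texture: crunchy")

# One job = a jobno pair followed by any number of pairs whose key is NOT jobno.
# (?!jobno:) is a negative lookahead: it stops the repetition as soon as the
# next pair would start with "jobno:", without consuming any text.
job_re = re.compile(r'jobno:\s\w+(?:\s(?!jobno:)\w+:\s\w+)*')

# Within one matched job, pull out every key/value pair.
pair_re = re.compile(r'(\w+):\s(\w+)')

# findall returns (key, value) tuples, which dict() accepts directly.
jobs = [dict(pair_re.findall(mo.group(0))) for mo in job_re.finditer(bigstr)]

for job in jobs:
    print(job)
```

This does the work in two passes (one regex to carve out each job, a second to split it into pairs) rather than in the single regex I was hoping for, but it does give one dict per jobno.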
TIA,
code_warrior