My Regex-fu is seriously lacking and I can't get my head round it... any help greatly received.
I am looking for a Python way to parse a string that a gnarly old piece of software (that I don't have source access to) spits out:
,Areas for further improvement,The school’s leaders are rightly seeking to improve the following areas:,,========2========,,3/5,Continue to focus on increasing performance at the higher levelsPupils’,literacy and numeracy skills across the curriculumStandards,in science throughout the schoolPupils’,numerical reasoning skills
What I want to do is:
(1) Remove all the existing , : = / characters to form a single contiguous string:
Areas for further improvementThe school’s leaders are rightly seeking to improve the following areas23/5Continue to focus on increasing performance at the higher levelsPupils’literacy and numeracy skills across the curriculumStandardsin science throughout the schoolPupils’numerical reasoning skills
(2) Then precede each capital letter with a single , so that I can use the string as sensible CSV input:
,Areas for further improvement,The school’s leaders are rightly seeking to improve the following areas23/5,Continue to focus on increasing performance at the higher levels,Pupils’literacy and numeracy skills across the curriculum,Standardsin science throughout the school,Pupils’numerical reasoning skills
I appreciate this will give me a leading , but I can strip that out when I write to file.
Is this possible via re.sub() and some regex-fu?
(Happy for this to be a two-step process: remove the existing junk characters, and then add a , before each capital letter.)
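To make the two-step idea concrete, here is a rough sketch of what I imagine it might look like. The helper name to_csv_line and the two pattern names are made up for illustration. One assumption to flag: step (1) lists / as junk, but the expected output keeps 3/5, so this sketch only strips , : and = and leaves / alone.

```python
import re

# Assumption: only , : and = are junk, since the expected output keeps "3/5".
# Add / inside the brackets if slashes should go too.
JUNK = re.compile(r"[,:=]")

# Any ASCII capital letter, captured so it can be put back after the comma.
CAPITAL = re.compile(r"([A-Z])")


def to_csv_line(raw):
    """Hypothetical helper: strip junk characters, then prefix capitals with a comma."""
    joined = JUNK.sub("", raw)          # step 1: one contiguous string
    return CAPITAL.sub(r",\1", joined)  # step 2: comma before each capital


sample = ",Areas for improvement,The leaders:,,====2====,,3/5,Continue to focusPupils,skills"
print(to_csv_line(sample))
```

If the leading comma is unwanted, calling .lstrip(",") on the result should remove it. Note that [A-Z] only covers unaccented ASCII capitals, and a capital in the middle of a phrase (a proper noun, say) would also get a comma in front of it.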
Can someone save my regex sanity please?
Cheers