I have a repeating text output, and I want to capture five groups from each repetition. The pattern stretches across several newlines, and I want to get an iterator of tuples, one per repetition. I tried the code below, but it only seems to capture the last match. Trying findall instead returns a list containing only that last tuple as well:
import re
string = '''-----------------------------------------------------------------------
Selecting top 2 features.
Top features (not sorted): CXVol,CCVol
Total prediction score (mean accuracy): 0.611111
precision recall f1-score support
1 0.62 0.83 0.71 6
2 1.00 0.50 0.67 6
3 0.43 0.50 0.46 6
accuracy 0.61 18
macro avg 0.68 0.61 0.61 18
weighted avg 0.68 0.61 0.61 18
Ranking of other features (sorted): IL10,IL5,R2GP,R2Thal,FACC,IL6,FASTR,R2CC,p75,TNF,GPVol,R2STR,ODISTR,STRVol,R2CC,FACX,ILB,ODIGP,FAHIPP,MDThal,FAThal,IL2,MDCC,MDSTR,MDGP
-----------------------------------------------------------------------
-----------------------------------------------------------------------
Selecting top 3 features.
Top features (not sorted): CXVol,CCVol,IL10
Total prediction score (mean accuracy): 0.666667
precision recall f1-score support
1 0.60 1.00 0.75 6
2 0.75 0.50 0.60 6
3 0.75 0.50 0.60 6
accuracy 0.67 18
macro avg 0.70 0.67 0.65 18
weighted avg 0.70 0.67 0.65 18
Ranking of other features (sorted): IL5,R2GP,R2Thal,FACC,IL6,FASTR,R2CC,p75,TNF,GPVol,R2STR,ODISTR,STRVol,R2CC,FACX,ILB,ODIGP,FAHIPP,MDThal,FAThal,IL2,MDCC,MDSTR,MDGP
-----------------------------------------------------------------------
-----------------------------------------------------------------------
Selecting top 4 features.
Top features (not sorted): CXVol,CCVol,IL5,IL10
Total prediction score (mean accuracy): 0.611111
precision recall f1-score support
1 0.60 1.00 0.75 6
2 0.75 0.50 0.60 6
3 0.50 0.33 0.40 6
accuracy 0.61 18
macro avg 0.62 0.61 0.58 18
weighted avg 0.62 0.61 0.58 18
Ranking of other features (sorted): R2GP,R2Thal,FACC,IL6,FASTR,R2CC,p75,TNF,GPVol,R2STR,ODISTR,STRVol,R2CC,FACX,ILB,ODIGP,FAHIPP,MDThal,FAThal,IL2,MDCC,MDSTR,MDGP
-----------------------------------------------------------------------
'''
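# Raw strings keep \s, \d and \S from being read as string escapes;
# re.S lets "." match newlines so the pattern can span several lines.
p = re.compile(r".*top\s(\d+)\sf"
               r".*Top.*ed\):\s(\S+)\n"
               r".*curacy\):\s(\S+)\n"
               r".*hted\savg\s+(\S+)\s+(\S+)", re.S)
m = p.finditer(string)
for x in m:
    print(x.groups())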
#Out ('4', 'CXVol,CCVol,IL5,IL10', '0.611111', '0.62', '0.61')
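A likely explanation for the single result: `.*` is greedy, and with `re.S` it also matches newlines, so the leading `.*` consumes everything up to the last block before the rest of the pattern gets a chance to match. Below is a minimal sketch of one possible fix, assuming lazy quantifiers (`.*?`) and no leading `.*`. The names `sample`, `lazy` and `results` are made up for illustration, and `sample` is a shortened stand-in for the report text, not the full output.

```python
import re

# Shortened, hypothetical stand-in for the report text (two blocks only).
sample = """-----
Selecting top 2 features.
Top features (not sorted): CXVol,CCVol
Total prediction score (mean accuracy): 0.611111
weighted avg 0.68 0.61 0.61 18
-----
-----
Selecting top 3 features.
Top features (not sorted): CXVol,CCVol,IL10
Total prediction score (mean accuracy): 0.666667
weighted avg 0.70 0.67 0.65 18
-----
"""

# Same five groups as the original pattern, but every ".*" is lazy (".*?")
# and the leading ".*" is dropped, so each match stops at the nearest
# "weighted avg" line instead of running on to the last one.
lazy = re.compile(
    r"top\s(\d+)\sf"
    r".*?Top.*?ed\):\s(\S+)\n"
    r".*?curacy\):\s(\S+)\n"
    r".*?hted\savg\s+(\S+)\s+(\S+)",
    re.S,
)

# finditer yields one match object per block; groups() gives the tuple.
results = [m.groups() for m in lazy.finditer(sample)]
for groups in results:
    print(groups)
```

With this change, `finditer` should yield one tuple per block rather than a single tuple for the last block. The same compiled pattern can be tried against the full `string` from the question.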