I'm parsing TrackMania's .Gbx replay files. Each file mixes binary data with an XML header, and the header is the part I'm interested in. I'm trying to extract it from the replay file with a regex. For most replays this works just fine, but I've run into one specific replay that breaks the regex.
import re
string = r'''
<header type="replay" exever="3.3.0" exebuild="2018-02-09_15_48"
title="TMStadium"><map uid="Y48WnfHlw9SkYptpMIVkd0PUpRm"
name="$fffTM$09FProLeague$fff xtasis -$09F GWF$fff2018
" author="w_1r" authorzone="World|Europe|Netherlands|Gelderland"/><desc
envir="Stadium" mood="Day" maptype="TrackMania\Race"
mapstyle="" displaycost="2149" mod="" /><playermodel id="StadiumCar"/><times
best="92373" respawns="1" stuntscore="7"
validable="1"/><checkpoints cur="13" onelap="13"/></header>
'''
header = r'(<header)(.*)(</header>)'
print(re.findall(header, string))
The other parts of the file don't seem to matter: even with only the hand-copied header shown above, the regex returns no match.
Could anyone help me find what I'm missing?
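
One possible explanation, offered as a sketch rather than a confirmed diagnosis: by default `.` in Python's `re` module does not match a newline, and the header above spans several lines (the `name` attribute itself contains a line break). Assuming that is the cause, passing `re.DOTALL` should let the pattern cross line breaks, and a non-greedy `.*?` keeps the match from running past the first `</header>`. The `sample` string below is a shortened, hypothetical stand-in for the real header, and `header_xml` is just an illustrative name.

```python
import re

# Shortened, hypothetical stand-in for the header above; note the line
# breaks between attributes and inside the name attribute.
sample = '''
<header type="replay" exever="3.3.0"
title="TMStadium"><map name="$fffTM xtasis
" author="w_1r"/><checkpoints cur="13" onelap="13"/></header>
'''

# Default behaviour: "." stops at newlines, so the original pattern
# cannot span from "<header" to "</header>" and finds nothing.
print(re.findall(r'(<header)(.*)(</header>)', sample))

# re.DOTALL makes "." match newlines too; ".*?" is non-greedy, so the
# match ends at the first closing tag.
pattern = re.compile(r'<header.*?</header>', re.DOTALL)
match = pattern.search(sample)
header_xml = match.group(0) if match else None
print(header_xml)
```

If the header is read straight from the replay file in binary mode, the same idea should carry over with a bytes pattern (for example `rb'<header.*?</header>'`), though that part is untested against real .Gbx files here.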