I have an XML that I need to edit (1k+ lines) and the changes involve multiple lines in some cases. Then, my solution until this moment has been working the xml as a file opened as a string.
The line I'm looking to change is:
<p class=\"DailyReadingImage\" field=\"heading\"><img src=\"../Final/Images/$1.$2 Spanish.png\" style=\"real-height: 0.25in; real-width: 0.25in; width: 50%\" /></p>
To only be <p />
.
According to regex101 my pattern for looking and replacing should be:
<p( class=\"[^\"]+\")??( style=\"[^\"]+\")??.+<img[^>]+><.p>
This is the current version of the function:
import re
def repLine(k, f, r): # Key, File (a string at this point), Replacement
fi = re.sub(re.compile(k), r, fi)
return fi
But when I try print(repLine(k, f, r))
it doesn't return the string with those changes.
Trying to print matches in order to investigate only returns [('',''), ('',''), ('','')]
(since there's three of these lines I need to replace).
I've tried importing re
and regex as re
with no positive results. Any suggestions?
NOTE: Regex were in PCRE2, I changed them to Python format.
- I've tried
fi = fi.replace(k, r)
- I've also tried creating a bucket with
re.findall()
and loop through it to replace. - Tried going line by line but some lines require multiline search ('\n').