The input is many files with small differences, e.g. h3 can be h2, or there can be two
at the end, so I want to use (ver1|ver2|ver3) but replace only part of the match.
regex (which doesn't work):
import re

filedata = re.sub(r"""
\uF0B7<br/>\n               # Ě<br/> marks a bullet point
(?P<txt>.*?)                # the item text
(?P<end>(?:<br/>)\n|\n<h)   # the possible endings
""", r'\n<li>\g<txt></li>\g<end>', filedata, flags=re.S | re.VERBOSE)
input:
(...)
\uF0B7<br/>
Something1<br/>
\uF0B7<br/>
Something2<br/>
\uF0B7<br/>
Something3
<h3>Next Topic
(...)
Unfortunately, (?:<br/>) doesn't work: <br/> is still in \g<end>.
result:
(...)
<li>Something1</li><br/>
<li>Something2</li><br/>
<li>Something3</li>
<h3>Next Topic
(...)
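The result above seems to follow from how groups nest: (?:...) only avoids creating an extra numbered group, but whatever it matches still belongs to the enclosing named group. A minimal sketch of that behaviour, using a made-up one-item sample (the `sample`, `nested` and `moved` names are only for illustration):

```python
import re

# Hypothetical one-item sample, not taken from the real files.
sample = "\uF0B7<br/>\nSomething1<br/>\n"

# The non-capturing group sits inside (?P<end>...), so its text is
# still part of what "end" matched.
nested = re.search(
    r"\uF0B7<br/>\n(?P<txt>.*?)(?P<end>(?:<br/>)\n|\n<h)",
    sample,
    flags=re.S,
)
print(repr(nested.group("end")))  # '<br/>\n'

# With <br/> moved outside the named group, "end" holds only the newline.
moved = re.search(
    r"\uF0B7<br/>\n(?P<txt>.*?)(?:<br/>)(?P<end>\n)",
    sample,
    flags=re.S,
)
print(repr(moved.group("end")))  # '\n'
```

So to drop <br/> from the replacement, it has to be matched outside every group that is written back.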
expected result:
(...)
<li>Something1</li>
<li>Something2</li>
<li>Something3</li>
<h3>Next Topic
(...)
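One possible way to get the expected result, as a sketch rather than a definitive fix: consume <br/> without writing it back, and use lookaheads so the newline and the heading stay in place. The sample string below is an assumption that mirrors the input listing without the (...) parts, and the leading \n of the original replacement is left out so the output lines up with the expected listing.

```python
import re

# Hypothetical sample mirroring the input listing (without the "(...)" parts).
filedata = (
    "\uF0B7<br/>\nSomething1<br/>\n"
    "\uF0B7<br/>\nSomething2<br/>\n"
    "\uF0B7<br/>\nSomething3\n"
    "<h3>Next Topic\n"
)

pattern = re.compile(r"""
    \uF0B7<br/>\n          # bullet marker line
    (?P<txt>.*?)           # item text, as short as possible
    (?:<br/>(?=\n)         # ending 1: consume <br/>, leave the newline
      |(?=\n<h)            # ending 2: stop before a newline plus heading
    )
""", flags=re.S | re.VERBOSE)

# Only the item text is written back; <br/> is consumed and dropped.
result = pattern.sub(r"<li>\g<txt></li>", filedata)
print(result)
```

Because the lookaheads consume nothing, no \g<end> group is needed in the replacement, and other ending variants can be added as further alternatives.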
(I know that <li> requires <ul> or <ol>, but that is handled in another regex.)
in source is from previous step – Tomasz Brzezina Mar 26 '16 at 00:28