How can I parse [u][i][b]sometext[/i][/u] into <u><i>[b]sometext</i></u>?
I have some sort of markup which I need to convert into tags. Regex works well until the tags are nested. Is there any library for this in Python/Django?
Here's an approach that takes advantage of the callback mechanism available in re.sub. The idea is to substitute recursively: the callback converts the outer tag pair and then runs the same substitution on the text between them.
Tested on Python 2.7 and Python 3.4.
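import re

s = ...  # your text here

# [tag] ... [/tag], where \1 forces the closing tag to match the opening one
PATTERN = r"\[(.*?)\](.*?)\[/\1\]"

def replace(m):
    # Convert the outer pair, then recurse into its contents for nested tags.
    inner = re.sub(PATTERN, replace, m.group(2), flags=re.DOTALL)
    return '<' + m.group(1) + '>' + inner + '</' + m.group(1) + '>'

# Pass re.DOTALL as flags=; the fourth positional argument of re.sub is count.
s = re.sub(PATTERN, replace, s, flags=re.DOTALL)
print(s)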
[u][i][b]sometext[/i][/u]
Output
<u><i>[b]sometext</i></u>
[u][i][b]sometext[/b][/i][/u]
Output
<u><i><b>sometext</b></i></u>
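These are the only two cases I've tried it on. It should work for most usual cases, with one known limitation: a tag nested inside the same tag, such as [b][b]x[/b][/b], is paired with the first closing tag, so the outer pair is converted and the rest is left as-is.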
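If the regex gets in the way, for example when the same tag is nested inside itself or when only certain tags should be converted, a small stack-based pass is one alternative. The following is only a sketch under assumptions that are not in the question: the names convert, ALLOWED_TAGS and TAG_RE are made up for illustration, the whitelist of b, i and u is a guess at what you want to allow, and tag names are assumed to be plain word characters. An opening tag is written out literally at first and only rewritten once its closing tag shows up, so an unclosed [b] stays as text.

```python
import re

# Hypothetical whitelist: only these tag names are converted.
ALLOWED_TAGS = {"b", "i", "u"}
# Matches [tag] or [/tag]; assumes tag names are word characters only.
TAG_RE = re.compile(r"\[(/?)(\w+)\]")

def convert(text):
    out = []    # output fragments, joined at the end
    stack = []  # (tag name, index of its literal "[tag]" in out)
    pos = 0
    for m in TAG_RE.finditer(text):
        out.append(text[pos:m.start()])  # plain text before this tag
        pos = m.end()
        closing, tag = m.group(1) == "/", m.group(2)
        if tag not in ALLOWED_TAGS:
            out.append(m.group(0))       # unknown tag: keep as text
        elif not closing:
            stack.append((tag, len(out)))
            out.append(m.group(0))       # stays literal unless a closer is found
        else:
            # Look for the nearest open tag with the same name.
            for k in range(len(stack) - 1, -1, -1):
                if stack[k][0] == tag:
                    out[stack[k][1]] = "<%s>" % tag
                    out.append("</%s>" % tag)
                    del stack[k:]        # unclosed tags above it stay literal
                    break
            else:
                out.append(m.group(0))   # closer without an opener
    out.append(text[pos:])
    return "".join(out)

print(convert("[u][i][b]sometext[/i][/u]"))
print(convert("[b][b]x[/b][/b]"))
```

The first call prints <u><i>[b]sometext</i></u> and the second prints <b><b>x</b></b>. Note that this sketch does not escape HTML already present in the text, so that would need to be handled separately before rendering user input.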