Using Python.
So basically I have a XML like tag syntax but the tags don't have attributes. So <a>
but not <a value='t'>
. They close regularly with </a>
.
Here is my question. I have something that looks like this:
<al>
1. test
2. test2
test with new line
3. test3
<al>
1. test 4
<al>
2. test 5
3. test 6
4. test 7
</al>
</al>
4. test 8
</al>
And I want to transform it into:
<al>
<li>test</li>
<li> test2</li>
<li> test with new line</li>
<li> test3
<al>
<li> test 4 </li>
<al>
<li> test 5</li>
<li> test 6</li>
<li> test 7</li>
</al>
</li>
</al>
</li>
<li> test 8</li>
</al>
I'm not really looking for a completed solution but rather a push into the right direction. I am just wondering how the folks here would approach the problem. Solely REGEX? write a full custom parser for the attribute-less tag syntax? Hacking up existing XML parsers? etc.
Thanks in advance