I have a line like
<charseq START="98" END="139">Low Insulation Resistance (<10 Megohms): 0</charseq>
in part of my data, which is in XML format. I use the lxml-xml parser. The <10 is obviously meant as a mathematical expression ("less than 10") and not as the start of an XML tag.
So when I run the following code:
soup = BeautifulSoup(data, 'lxml-xml')
the document is not processed correctly and the XML data is truncated abruptly at this point. Do I have to escape the data manually, or is there some argument I can pass to BeautifulSoup to handle such cases?
I am expecting a well-formed XML document in soup, so that I can use find() to access various parts of the document.
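
For reference, the manual escaping I have in mind would look roughly like the sketch below. It assumes that a < followed by anything other than a letter, _, :, /, ! or ? is literal text and not markup. The names escape_stray_lt and STRAY_LT are made up for illustration, and the check uses the standard library's xml.etree.ElementTree instead of BeautifulSoup so that it runs on its own.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a '<' begins real markup only when followed by a letter, '_', ':',
# '/', '!' or '?'. Any other '<' (e.g. '<10' or '< 5') is treated as literal text.
STRAY_LT = re.compile(r'<(?![A-Za-z_:/!?])')

def escape_stray_lt(xml_text):
    """Replace '<' characters that cannot begin a tag with '&lt;'."""
    return STRAY_LT.sub('&lt;', xml_text)

data = '<charseq START="98" END="139">Low Insulation Resistance (<10 Megohms): 0</charseq>'

fixed = escape_stray_lt(data)

# Check that the escaped string is now well-formed XML.
root = ET.fromstring(fixed)
print(root.text)
```

The escaped string could then be passed to BeautifulSoup(fixed, 'lxml-xml') in place of the raw data. This is only a heuristic: it would miss a stray < that is directly followed by a letter (such as a<b), and it would also rewrite a < inside a CDATA section, where escaping is not wanted.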