
I have a line like

<charseq START="98" END="139">Low Insulation Resistance (<10 Megohms): 0</charseq>

in part of my data, which is in XML format. I use the lxml-xml parser. The <10 is obviously meant as a mathematical expression ("less than 10"), not the start of an XML tag.

When I run the following code:

soup = BeautifulSoup(data, 'lxml-xml')

the document is not processed correctly: the XML data is truncated abruptly at this point. Do I have to escape the data manually, or is there an argument I can pass to BeautifulSoup to handle such cases?

I expect soup to hold a well-formed XML document so that I can use find() to access various parts of it.
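
For reference, here is a minimal sketch of what escaping the data manually might look like. It is only a heuristic, not a BeautifulSoup feature: it assumes that a "<" which really opens markup is always followed by a letter, "_", ":", "/", "!" or "?", and treats every other "<" (such as the one in <10) as text. The helper name escape_stray_lt is made up for illustration, and the result is checked with the standard library parser rather than lxml.

```python
import re
import xml.etree.ElementTree as ET

# Heuristic (an assumption, not part of any library): a "<" that opens real
# markup is followed by a name-start character, "/", "!" or "?".
# Any other "<", e.g. the one in "<10", is treated as literal text.
_STRAY_LT = re.compile(r"<(?![A-Za-z_:/!?])")


def escape_stray_lt(xml_text):
    """Replace '<' characters that cannot start a tag with '&lt;'."""
    return _STRAY_LT.sub("&lt;", xml_text)


raw = '<charseq START="98" END="139">Low Insulation Resistance (<10 Megohms): 0</charseq>'
fixed = escape_stray_lt(raw)

# Sanity check: the cleaned string now parses, and the text comes back unescaped.
elem = ET.fromstring(fixed)
print(elem.text)
```

The cleaned string could then be passed to BeautifulSoup(fixed, 'lxml-xml') in place of the raw data. This sketch would miss a stray "<" directly followed by a letter, and it would also rewrite a "<" inside CDATA sections or comments, so it only suits data where neither case occurs.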

ShastaBiz