How to process less-than symbols inside text in BeautifulSoup4?

Asked Apr 25 '23 at 16:58

Active Apr 25 '23 at 17:21

Viewed 12 times

I have a line like

<charseq START="98" END="139">Low Insulation Resistance (<10 Megohms): 0</charseq>

in part of my data, which is in XML format. I use the lxml-xml parser. The <10 is obviously meant to be a mathematical expression, not the start of an XML tag.

When I run the following code:

soup = BeautifulSoup(data, 'lxml-xml')

the document is not parsed correctly and the XML data is truncated abruptly at this point. Do I have to escape the data manually, or is there an argument I can pass to BeautifulSoup to handle such cases?

I am expecting a legal XML document in soup so that I can use find() to access various parts of the document.
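For reference, the manual-escaping route could look roughly like the sketch below. It is only a heuristic, not a BeautifulSoup or lxml feature: the helper name escape_bare_lt and the regex are my own assumptions, and the rule is simply "a < that is not followed by a letter, underscore, /, ! or ? cannot start a tag, so treat it as text". It would wrongly touch a < inside a CDATA section and would miss text such as "<x" where a letter follows the symbol. The standard-library ElementTree parser is used here only to confirm that the result is well-formed.

```python
import re
import xml.etree.ElementTree as ET

# Heuristic (an assumption, not a library feature): a "<" that is NOT followed
# by a letter, "_", "/", "!" or "?" cannot begin a start tag, end tag, comment,
# CDATA section or processing instruction, so it is treated as literal text.
BARE_LT = re.compile(r'<(?![A-Za-z_/!?])')


def escape_bare_lt(data: str) -> str:
    """Replace stray '<' characters with the '&lt;' entity (hypothetical helper)."""
    return BARE_LT.sub('&lt;', data)


data = ('<charseq START="98" END="139">'
        'Low Insulation Resistance (<10 Megohms): 0</charseq>')

escaped = escape_bare_lt(data)

# Sanity check with a strict XML parser: this raises ParseError on the raw
# string, but succeeds once the stray "<" has been escaped.
root = ET.fromstring(escaped)
print(root.text)
```

The escaped string can then be passed on as before, i.e. BeautifulSoup(escaped, 'lxml-xml'), and the parsed text still contains the literal < character because the entity is decoded by the parser.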


ShastaBiz


0 Answers