
I am trying to read an XML file with xml.etree.ElementTree, but it raises an error when it reaches a specific line of the file. I assumed this was a regular XML file. How can I resolve this?

import xml.etree.ElementTree as ET
xml_file = ET.parse("/Users/arash/project/my_project/extracting_features/TraceData.log").getroot()

The attached image shows part of my file. At line 56526 the parser raises an error: not well-formed (invalid token). I don't know why; the only difference on that line is that it is a new tag with some new attributes. Any help would be appreciated.
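To narrow down what the parser is objecting to, it can help to catch the exception and print the exact line it points at. The sketch below is only an illustration: `locate_parse_error` and the sample string are hypothetical, and the raw `&` in the sample is just one common cause of an "invalid token" error, not necessarily the cause here. It relies on the `position` attribute of `ET.ParseError`, which holds a (line, column) tuple.

```python
import xml.etree.ElementTree as ET


def locate_parse_error(xml_text):
    """Return (line, column, offending_line) for the first parse error, or None."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # position is a (line, column) tuple; the line number is 1-based.
        line, column = err.position
        lines = xml_text.splitlines()
        offending = lines[line - 1] if 1 <= line <= len(lines) else ""
        return line, column, offending
    return None


# Hypothetical sample: an unescaped '&' inside an attribute value.
sample = '<root>\n  <ok a="1"/>\n  <bad a="x & y"/>\n</root>'
result = locate_parse_error(sample)
if result is not None:
    print("parse error at line", result[0], "->", result[2].strip())
```

For the real file, the text could be read with `open(path, encoding=...)` and passed to the same function; the correct encoding is not known from the question, so that part is an assumption.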

Arash
  • Please edit your question and add the actual XML, not an image. – Jack Fleeting Jan 05 '21 at 21:51
  • @JackFleeting Thanks. The actual file is too big to trace through (about 12 MB), which is why I only posted part of it as a picture. – Arash Jan 05 '21 at 21:57
  • XML files which are not _well-formed_ are invalid. Sanitize the XML file with another tool before loading it. XML is strict (and not HTML, whose parsers are optimized to make sense of partially invalid files). – zx485 Jan 05 '21 at 22:00
  • @zx485 Thanks for your response. Is there any way to make cElementTree ignore that part of the file? I don't need that part of the XML at all. The file contains 173370 lines, so it is hard to change parts of it by hand. – Arash Jan 05 '21 at 22:41
  • @zx485 This is the tag that makes cElementTree raise the not well-formed error. – Arash Jan 05 '21 at 22:43
  • See https://stackoverflow.com/q/44765194/407651 – mzjn Jan 06 '21 at 18:21
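ElementTree has no switch to skip malformed input, so the "sanitize first" advice in the comments has to be done on the text before parsing. One possible sketch, under the strong assumption that each unwanted, malformed element sits entirely on its own line: parse, and whenever the parser reports an error, drop the reported line and retry. The function name `parse_dropping_bad_lines`, the `max_drops` limit, and the sample data are all hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_dropping_bad_lines(xml_text, max_drops=10):
    """Parse xml_text, removing each line the parser reports as malformed.

    Returns (root, dropped), where dropped is the list of removed lines.
    Re-raises the ParseError once max_drops lines have been removed.
    """
    lines = xml_text.splitlines()
    dropped = []
    while True:
        try:
            return ET.fromstring("\n".join(lines)), dropped
        except ET.ParseError as err:
            line_no = err.position[0]  # 1-based line of the error
            if len(dropped) >= max_drops or not 1 <= line_no <= len(lines):
                raise
            dropped.append(lines.pop(line_no - 1))


# Hypothetical sample: one self-contained malformed element on its own line.
sample = '<root>\n  <ok a="1"/>\n  <bad a="x & y"/>\n  <ok a="2"/>\n</root>'
root, dropped = parse_dropping_bad_lines(sample)
print(len(root), "elements kept;", len(dropped), "line(s) dropped")
```

This is a blunt tool. If the error is structural (for example a mismatched closing tag), the reported line may not be the real culprit and dropping it can make things worse, so it is worth inspecting `dropped` before trusting the result.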

0 Answers