When I use xmltodict to load the XML file below, I get this error: xml.parsers.expat.ExpatError: not well-formed (invalid token): line 1, column 1
Here is my file:
<?xml version="1.0" encoding="utf-8"?>
<mydocument has="an attribute">
<and>
<many>elements</many>
<many>more elements</many>
</and>
<plus a="complex">
element as well
</plus>
</mydocument>
Source:
import xmltodict

with open('fileTEST.xml') as fd:
    xmltodict.parse(fd.read())
I am on Windows 10, using Python 3.6 and xmltodict 0.11.0
If I use ElementTree instead, it works:
import xml.etree.ElementTree as ET

tree = ET.ElementTree(file='fileTEST.xml')
for elem in tree.iter():
    print(elem.tag, elem.attrib)
mydocument {'has': 'an attribute'}
and {}
many {}
many {}
plus {'a': 'complex'}
Note: I might have run into a newline problem.
Note 2: I used Beyond Compare to compare two different files.
xmltodict crashes on the file that is encoded as UTF-8 with BOM, and works on the plain UTF-8 file.
The UTF-8 BOM is a sequence of bytes (EF BB BF) at the start of a file that allows the reader to identify the file as being encoded in UTF-8.
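Given the BOM observation, a possible workaround is to open the file with the 'utf-8-sig' codec, which decodes UTF-8 and skips a leading BOM if there is one. This is only a minimal sketch, assuming the file really is UTF-8 with a BOM and that the failure comes from the default text encoding on Windows turning the BOM bytes into stray characters in front of <?xml. The helper name read_xml_text and the sample content are made up for illustration; ElementTree is used here so the snippet runs with the standard library alone, but the returned string should be usable with xmltodict.parse(...) in the same way.

```python
import codecs
import os
import tempfile
import xml.etree.ElementTree as ET


def read_xml_text(path):
    """Read a UTF-8 file, dropping a leading BOM if one is present."""
    # 'utf-8-sig' decodes UTF-8 and skips the EF BB BF prefix when it exists;
    # files without a BOM are read unchanged.
    with open(path, encoding="utf-8-sig") as fd:
        return fd.read()


# Build a small BOM-prefixed sample file (hypothetical content for illustration).
sample = '<?xml version="1.0" encoding="utf-8"?><mydocument has="an attribute"/>'
with tempfile.NamedTemporaryFile(suffix=".xml", delete=False) as tmp:
    tmp.write(codecs.BOM_UTF8 + sample.encode("utf-8"))
    sample_path = tmp.name

text = read_xml_text(sample_path)
os.remove(sample_path)

# The BOM is gone, so the text starts directly with the XML declaration.
root = ET.fromstring(text)
print(root.tag, root.attrib)  # mydocument {'has': 'an attribute'}
```

For the original file this would amount to open('fileTEST.xml', encoding='utf-8-sig') before calling xmltodict.parse(fd.read()). Opening the file in binary mode ('rb') and passing the bytes to the parser may be another option, since the underlying expat parser can then handle the encoding itself.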