I am trying to use ElementTree to parse XML files. Since the parser does not retain comments by default, I used the following code from http://bugs.python.org/issue8277:
import xml.etree.ElementTree as etree


class CommentedTreeBuilder(etree.TreeBuilder):
    """A TreeBuilder subclass that retains comments."""

    def comment(self, data):
        self.start(etree.Comment, {})
        self.data(data)
        self.end(etree.Comment)


parser = etree.XMLParser(target = CommentedTreeBuilder())
The above is in documents.py. I tested it with:
import os
import sys
import unittest
import xml.etree.ElementTree as etree

import documents


class TestDocument(unittest.TestCase):

    def setUp(self):
        filename = os.path.join(sys.path[0], "data", "facilities.xml")
        self.doc = etree.parse(filename, parser = documents.parser)

    def testClass(self):
        print("Class is {0}.".format(self.doc.__class__.__name__))

    # (other tests commented out)


if __name__ == '__main__':
    unittest.main()
This barfs with:
Traceback (most recent call last):
  File "/home/goncalo/documents/games/ja2/modding/mods/xml-overhaul/src/scripts/../tests/test_documents.py", line 24, in setUp
    self.doc = etree.parse(filename, parser = documents.parser)
  File "/usr/lib/python3.3/xml/etree/ElementTree.py", line 1242, in parse
    tree.parse(source, parser)
  File "/usr/lib/python3.3/xml/etree/ElementTree.py", line 1726, in parse
    parser.feed(data)
IndexError: pop from empty stack
What am I doing wrong? By the way, the XML in the file is valid (as checked by an independent program) and is encoded in UTF-8.
Notes:
- I am using Python 3.3 on Kubuntu 13.04, in case it is relevant. I make sure to use "python3" (and not just "python") to run the test scripts.
Edit: here is the sample XML file used; it is very small:
<?xml version="1.0" encoding="utf-8"?>
<!-- changes to facilities.xml by G. Rodrigues: ar overhaul.-->
<SECTORFACILITIES>
    <!-- Drassen -->
    <!-- Small airport -->
    <FACILITY>
        <SectorGrid>B13</SectorGrid>
        <FacilityType>4</FacilityType>
        <ubHidden>0</ubHidden>
    </FACILITY>
</SECTORFACILITIES>
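
To narrow this down without needing the file on disk, here is a self-contained sketch that feeds the sample document to the same builder from memory. The names `try_parse`, `WITH_TOP_COMMENT` and `WITHOUT_TOP_COMMENT` are made up for this example, and the second document is a hypothetical variant that I constructed by dropping the comment that sits before the root element. I am not assuming any particular outcome for the first call; the outcome may also differ between Python versions. Comparing the two results should at least show whether that leading comment is involved.

```python
import xml.etree.ElementTree as etree


class CommentedTreeBuilder(etree.TreeBuilder):
    """A TreeBuilder subclass that retains comments (same recipe as above)."""

    def comment(self, data):
        self.start(etree.Comment, {})
        self.data(data)
        self.end(etree.Comment)


# The sample document from this post, including its top-level comment.
WITH_TOP_COMMENT = b"""<?xml version="1.0" encoding="utf-8"?>
<!-- changes to facilities.xml by G. Rodrigues: ar overhaul.-->
<SECTORFACILITIES>
    <!-- Drassen -->
    <!-- Small airport -->
    <FACILITY>
        <SectorGrid>B13</SectorGrid>
        <FacilityType>4</FacilityType>
        <ubHidden>0</ubHidden>
    </FACILITY>
</SECTORFACILITIES>
"""

# Hypothetical variant: identical, minus the comment before the root element.
WITHOUT_TOP_COMMENT = WITH_TOP_COMMENT.replace(
    b"<!-- changes to facilities.xml by G. Rodrigues: ar overhaul.-->\n", b""
)


def try_parse(xml_bytes):
    """Return the root tag on success, or a description of the error."""
    # Use a fresh parser for every document so the runs cannot affect each other.
    parser = etree.XMLParser(target=CommentedTreeBuilder())
    try:
        return etree.fromstring(xml_bytes, parser=parser).tag
    except Exception as exc:  # broad on purpose: this is only a diagnostic
        return "{0}: {1}".format(type(exc).__name__, exc)


if __name__ == "__main__":
    print("with top-level comment:   ", try_parse(WITH_TOP_COMMENT))
    print("without top-level comment:", try_parse(WITHOUT_TOP_COMMENT))
```

Note that, unlike the module-level `parser` in documents.py, this sketch builds a new `XMLParser` per call, so a test class with several test methods would not share one parser instance across `setUp` runs.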