I have been searching for the answer to this for days. All of the examples of creating custom XML parsers that I can find (e.g. in the docs, or this example, or this question or this question) talk about generating entirely new data (e.g. the depth of the XML, or a CSV equivalent).
However, all I want to do is intercept the parsing, inspect the data, possibly alter it, then let the parser continue.
I tried this, as a toy example:
from xml.etree import ElementTree as ET

class myParser():
    def start(self, tag, data):
        ##### Pretty sure this is wrong, but what should it be? #####
        return ET.XMLParser.start(tag, data)

    def data(self, data):
        return ET.XMLParser.data(data.replace('"', '"'))

    def end(self, tag):
        return ET.XMLParser.end(tag)

    def close(self):
        return ET.XMLParser.close()

def parseFile(fileName):
    p = ET.XMLParser(target=myParser)
    tree = ET.parse(fileName, parser=p)
but I'm getting
TypeError: 'unbound method start() must be called with myParser instance as first argument (got str instance instead)'
I feel like I'm close, but I'm missing some vital piece of the puzzle that I just can't see.
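For illustration, here is a minimal sketch of the kind of pass-through described above. It rests on some assumptions: Python 3, subclassing `ET.TreeBuilder` (the standard tree-building target) rather than calling `ET.XMLParser` methods, and passing an instance of the target instead of the class. The names `InterceptingBuilder` and `parse_string` are hypothetical, and the double-to-single quote replacement only stands in for whatever alteration is actually wanted. It parses a string via `ET.fromstring` to stay self-contained.

```python
from xml.etree import ElementTree as ET


class InterceptingBuilder(ET.TreeBuilder):
    """Hypothetical target: inspect or alter events, then delegate to TreeBuilder."""

    def start(self, tag, attrs):
        # Inspect tag/attrs here if needed, then let the normal builder continue.
        return super().start(tag, attrs)

    def data(self, data):
        # Placeholder alteration (an assumption): swap double quotes for single quotes.
        return super().data(data.replace('"', "'"))


def parse_string(xml_text):
    # Pass an *instance* of the target, not the class itself.
    parser = ET.XMLParser(target=InterceptingBuilder())
    # fromstring() feeds the text and returns whatever the target's close() returns,
    # which for a TreeBuilder is the root element.
    return ET.fromstring(xml_text, parser=parser)


if __name__ == "__main__":
    root = parse_string('<root>say "hi"<child>x"y</child></root>')
    print(ET.tostring(root, encoding="unicode"))
```

Only `start` and `data` are overridden here; `end` and `close` are inherited unchanged, so the tree is still built by the stock `TreeBuilder`. Text may arrive in several `data` calls, so an alteration that spans chunk boundaries would need buffering.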