I have an XML file that declares UTF-8 encoding. When I open the file in Vim, I see something like
<?xml version="1.0" encoding="UTF-8"?>
<r>
<first-tag>foo</first-tag>
<second-tag>
&lt;a-tag-nested-in-second-tag&gt;some data&lt;/a-tag-nested-in-second-tag&gt;
</second-tag>
...
</r>
I'm using the SAXParser from Java 1.6.0_41. While consuming this data, the parser doesn't treat the escaped literals as markup: it either skips over them or reports them as character data ("content") for second-tag.
Here's how I'm consuming the data:
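To make the behaviour easier to reproduce, here is a minimal sketch that parses an in-memory string instead of my real file. The class name EscapedMarkupDemo, the helper collectText, and the shortened element name a are made up for illustration. It shows that SAX hands the escaped markup to characters() as ordinary text and never fires startElement for it.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class EscapedMarkupDemo {
    // Hypothetical helper: returns all character data the parser reports for the document.
    static String collectText(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        SAXParserFactory.newInstance().newSAXParser().parse(
            new InputSource(new StringReader(xml)),
            new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    // &lt; and &gt; arrive here already decoded to < and >
                    text.append(ch, start, length);
                }
            });
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the real file: the nested markup is escaped with entities.
        String xml = "<r><second-tag>&lt;a&gt;some data&lt;/a&gt;</second-tag></r>";
        System.out.println(collectText(xml)); // prints <a>some data</a>
    }
}
```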
File f = ...
SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
InputStream stream = new FileInputStream(f);
AbstractHandler handler = ...
parser.parse(new InputSource(stream), handler);
Is there a way for SAX to treat the nested escaped XML data as real XML markup, rather than as plain character data for second-tag?
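One workaround I can think of, in case it helps frame the question: buffer the character data of second-tag and feed it to a second SAX parse. The sketch below is only an assumption about how that might look, not something SAX does on its own. The class names, the parseNested helper, and the artificial wrapper root element are all hypothetical; the wrapper is there because the buffered text may not have a single root element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class NestedXmlReparse {
    // Handles the re-parsed fragment; here it just records element names.
    static class InnerHandler extends DefaultHandler {
        final List<String> elements = new ArrayList<String>();
        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (!"wrapper".equals(qName)) { // skip the artificial root
                elements.add(qName);
            }
        }
    }

    // Handles the outer document and re-parses the text found inside second-tag.
    static class OuterHandler extends DefaultHandler {
        final InnerHandler inner = new InnerHandler();
        private StringBuilder buffer; // non-null only while inside second-tag

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("second-tag".equals(qName)) {
                buffer = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (buffer != null) {
                buffer.append(ch, start, length); // entities are already decoded here
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if ("second-tag".equals(qName) && buffer != null) {
                String fragment = "<wrapper>" + buffer + "</wrapper>";
                buffer = null;
                try {
                    SAXParserFactory.newInstance().newSAXParser()
                        .parse(new InputSource(new StringReader(fragment)), inner);
                } catch (SAXException e) {
                    throw e;
                } catch (Exception e) {
                    throw new SAXException(e);
                }
            }
        }
    }

    // Hypothetical helper: names of the elements found in the escaped content.
    static List<String> parseNested(String xml) throws Exception {
        OuterHandler outer = new OuterHandler();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), outer);
        return outer.inner.elements;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<r><first-tag>foo</first-tag><second-tag>"
            + "&lt;a-tag-nested-in-second-tag&gt;some data&lt;/a-tag-nested-in-second-tag&gt;"
            + "</second-tag></r>";
        System.out.println(parseNested(xml)); // prints [a-tag-nested-in-second-tag]
    }
}
```

This only works if the escaped content is itself well-formed once unescaped; otherwise the inner parse throws a SAXException. I'd still prefer a way to have the parser do this directly, if one exists.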