I'm using Woodstox to process an XML document that contains some entities (most notably &gt;) in the value of one of the nodes. To use an extreme example, it's something like this:
<parent>&nbsp; &lt; &nbsp; &gt; &amp; &quot; &apos; &nbsp;</parent>
I have tried a lot of different configuration options for both WstxInputFactory (IS_REPLACING_ENTITY_REFERENCES, P_TREAT_CHAR_REFS_AS_ENTS, P_CUSTOM_INTERNAL_ENTITIES...) and WstxOutputFactory, but no matter what I try, the output is always something like this:
<parent>nbsp; &lt; nbsp; > &amp; " ' nbsp;</parent>
(&gt; gets converted to >, &lt; stays the same, &nbsp; loses the &...)
I'm reading the XML with an XMLEventReader created with
XMLEventReader reader = wstxInputFactory.createXMLEventReader(new StringReader(fulltext));
after configuring the WstxInputFactory.
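For reference, the read/write loop looks roughly like the sketch below. It is a simplified stand-in rather than my exact code: it assumes the standard javax.xml.stream factories instead of the Woodstox-specific WstxInputFactory and WstxOutputFactory, the class and method names (EntityRoundTrip, roundTrip) are made up for illustration, and the sample input leaves out &nbsp; because how an undeclared entity is handled depends on the parser implementation.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class, not part of Woodstox or the original code.
class EntityRoundTrip {

    // Parses the given XML text and writes every event straight back out.
    static String roundTrip(String fulltext) {
        try {
            // Standard StAX factories; with Woodstox on the classpath these
            // would normally resolve to its implementation (assumption).
            XMLInputFactory inputFactory = XMLInputFactory.newFactory();
            inputFactory.setProperty(
                    XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.FALSE);
            XMLOutputFactory outputFactory = XMLOutputFactory.newFactory();

            XMLEventReader reader =
                    inputFactory.createXMLEventReader(new StringReader(fulltext));
            StringWriter out = new StringWriter();
            XMLEventWriter writer = outputFactory.createXMLEventWriter(out);

            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                // Skip the document start/end events so that no XML
                // declaration is added to the output.
                if (event.isStartDocument() || event.isEndDocument()) {
                    continue;
                }
                writer.add(event);
            }
            writer.flush();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("XML round trip failed", e);
        }
    }

    public static void main(String[] args) {
        String input = "<parent>&lt; &gt; &amp; &quot; &apos;</parent>";
        System.out.println("input:  " + input);
        System.out.println("output: " + roundTrip(input));
    }
}
```

Comparing the two printed lines shows which entities survive the round trip with a given parser and writer.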
Is there any way to configure Woodstox to just ignore all entities and output the text exactly as it was in the input String?