As a special requirement, I have been trying to escape " with \" while writing XML using DOM.

Unfortunately, when I write text with Document.createTextNode(TextValue), it outputs \". However, the expected output is \".

Details:

Writing Text Value:

    public static boolean setDOMElementValue(Document doc, Element elem, String nodeValue) {
        try {
            elem.appendChild(doc.createTextNode(nodeValue));
            return true;
        } catch (DOMException ex) {
            LOG.log(Level.SEVERE, ex.toString());
            return false;
        }
    }

Writing XML:

    // Needs java.io.Writer, java.io.IOException and java.nio.charset.StandardCharsets
    // in addition to the imports already used by this snippet.
    public static boolean writeDOMToXML(Document doc, String xmlFilePath) {
        try {
            doc.setXmlStandalone(true);

            // Creating TransformerFactory and Transformer
            Transformer tr = TransformerFactory.newInstance().newTransformer();
            // Setting Transformer's output properties
            tr.setOutputProperty(OutputKeys.INDENT, "yes");
            tr.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
            tr.setOutputProperty(OutputKeys.METHOD, "xml");
            tr.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            tr.setOutputProperty(OutputKeys.STANDALONE, "no");

            // Setting DOMSource and StreamResult
            DOMSource source = new DOMSource(doc);
            File file = new File(xmlFilePath);
            // Explicit charset so the bytes match OutputKeys.ENCODING;
            // try-with-resources flushes and closes the file.
            try (Writer writer = new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8)) {
                // Transform and Return
                tr.transform(source, new StreamResult(writer));
            }
            return true;
        } catch (TransformerFactoryConfigurationError | TransformerConfigurationException ex) {
            LOG.log(Level.SEVERE, ex.toString());
            return false;
        } catch (TransformerException | IOException ex) {
            LOG.log(Level.SEVERE, ex.toString());
            return false;
        }
    }

Indigo
  • For better help sooner, post your code as an [SSCCE](http://www.sscce.org) that demonstrates your problem. This allows users to copy/paste and reproduce your issue. – Duncan Jones Nov 07 '13 at 13:59
  • You can force the `Transformer` not to escape _anything_ using [`transformer.setOutputProperty(OutputKeys.METHOD, "text")`](http://stackoverflow.com/a/3029456/2071828). That would, however, mean that you would need to escape required characters manually... – Boris the Spider Nov 07 '13 at 14:08
  • @Boris the Spider This way sounds good, although, this would only create a text file structure. Can't help in this case. – Indigo Nov 07 '13 at 14:25
  • _escape `"` with `\"`_: What's the use of it? Sounds like you want to transform into HTML, not XML. – Michael Konietzka Nov 07 '13 at 14:35
  • Yes, that's HTML, but it is a text node value in XML, which is supposed to be read back as-is later and converted to HTML. – Indigo Nov 07 '13 at 14:40
  • Sounds like you want to mix HTML and XML. You can transform directly into HTML: `setOutputProperty(OutputKeys.METHOD,"html")` – Michael Konietzka Nov 07 '13 at 14:51
  • @MichaelKonietzka: Yes correct, although, I can't write this file as HTML. Primarily it is an XML and only a few of the nodes have this value with HTML in that. I was able to do this pretty easily using C#, .NET, `XMLWriter` – Indigo Nov 10 '13 at 19:59
  • @Indigo Any solution to this? – Undisputed007 Jul 20 '16 at 07:09

1 Answer

When you build a text node with the DOM, simply pass the string in literally, e.g. doc.createTextNode("\""). When you serialize the DOM tree, the serializer takes care of escaping any characters as needed. Within a text node there is no need to escape a double or single quote; that is only necessary inside an attribute value, depending on the attribute value delimiter.
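
To illustrate the point, here is a minimal sketch (not the OP's code) that puts the same raw string into a text node and an attribute, serializes the tree with the default JAXP `Transformer`, and parses the result back. The class name `QuoteEscapingSketch` and its helper methods are made up for this example. How exactly the quote appears in the serialized output depends on the serializer implementation, so the sketch only checks that the value survives the round trip unchanged.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class QuoteEscapingSketch {

    // Builds <root attr="..."><item>...</item></root> with the same raw string in both places.
    static Document buildSample(String raw) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();
        Element root = doc.createElement("root");
        root.setAttribute("attr", raw);
        Element item = doc.createElement("item");
        item.appendChild(doc.createTextNode(raw)); // no manual escaping here
        root.appendChild(item);
        doc.appendChild(root);
        return doc;
    }

    // Serializes the DOM to a string; the serializer decides which characters to escape.
    static String serialize(Document doc) throws Exception {
        Transformer tr = TransformerFactory.newInstance().newTransformer();
        tr.setOutputProperty(OutputKeys.METHOD, "xml");
        tr.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        tr.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Parses serialized XML back into a DOM tree.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String raw = "say \"hi\" & <bye>";
        String xml = serialize(buildSample(raw));
        System.out.println(xml); // inspect how your serializer wrote the quotes

        Element root = parse(xml).getDocumentElement();
        String text = root.getElementsByTagName("item").item(0).getTextContent();
        System.out.println(raw.equals(text));                      // text node round trip
        System.out.println(raw.equals(root.getAttribute("attr"))); // attribute round trip
    }
}
```

Whatever escapes the serializer chooses, a conforming XML parser should hand back the original string, which is why the DOM side normally needs no escaping of its own.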

Martin Honnen
  • I think that's understood, the OP says _As a special requirement_. The question is how do you _force_ the transformer to escape quotes in text nodes? – Boris the Spider Nov 07 '13 at 14:02
  • I see, well, I don't think an XSLT 1.0 transformer can be forced to do that. With XSLT 2.0 like Saxon you would need to use the identity transformation with a character map that maps the quote character as needed. – Martin Honnen Nov 07 '13 at 14:09
  • @Martin Honnen, yes, what Boris the Spider said is correct. This quote escaping is a special requirement. Normally DOM serializer would take care of the standard escapes but this is an exception. – Indigo Nov 07 '13 at 14:10
  • I would prefer to stick to the current code structure that I have, since it will have to be changed extensively for this. – Indigo Nov 07 '13 at 14:14
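
One option that might keep the existing code structure is to add the backslash yourself before the string goes into the text node, since the DOM stores whatever string it is given. The sketch below assumes that every double quote should get exactly one backslash in front of it and that existing backslashes need no special handling. `PreEscapeSketch`, `escapeQuotes`, and `setEscapedText` are hypothetical names, not part of the OP's code. Whether the quote itself is then written as `"` or as an entity still depends on the serializer.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class PreEscapeSketch {

    // Hypothetical helper: puts a backslash in front of every double quote.
    static String escapeQuotes(String value) {
        return value.replace("\"", "\\\"");
    }

    // Variant of setDOMElementValue that stores the pre-escaped string in the text node.
    static void setEscapedText(Document doc, Element elem, String nodeValue) {
        elem.appendChild(doc.createTextNode(escapeQuotes(nodeValue)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element p = doc.createElement("p");
        doc.appendChild(p);
        setEscapedText(doc, p, "<a href=\"x\">link</a>");
        // The DOM now holds the backslashes as ordinary characters.
        System.out.println(p.getTextContent()); // prints: <a href=\"x\">link</a>
    }
}
```

The reader of the file would then have to undo the same convention, because to XML the backslash is just another character.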