9

I want to create an XML document where blanks are replaced by `&#160;`. But the Java Transformer escapes the ampersand, so the output is `&amp;#160;`.

Here is my sample code:

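import java.io.ByteArrayOutputStream;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class Test {

    public static void main(String[] args) throws Exception {

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.newDocument();

        Element element = document.createElement("element");
        // The literal text "&#160;" is what gets escaped to "&amp;#160;" on output
        element.setTextContent("&#160;");
        document.appendChild(element);

        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StreamResult streamResult = new StreamResult(stream);
        transformer.transform(new DOMSource(document), streamResult);
        System.out.println(stream.toString());

    }

}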

And this is the output of my sample code:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<element>&amp;#160;</element>

Any ideas on how to fix or avoid that? Thanks a lot!

oleh
  • I wonder why you'd want to replace the blanks. You want to explain that? – Wivani Sep 19 '11 at 10:10
  • As I understand it, he wants a non-breaking blank (which is exactly what 0xA0 is) instead of an ordinary one – Andrei LED Sep 19 '11 at 10:14
  • The XML I create is XSL-FO, where I need the blanks for the block elements. The blanks are necessary because Apache FOP seems to ignore leading blanks. I was advised elsewhere to replace blanks with this entity. Andrei is right. – oleh Sep 19 '11 at 10:15

4 Answers

7

Disable output escaping around the element with processing instructions, as follows:

Node disableEscaping = document.createProcessingInstruction(StreamResult.PI_DISABLE_OUTPUT_ESCAPING, "&");
Element element = document.createElement("element");
element.setTextContent("&#160;");
document.appendChild(disableEscaping);
document.appendChild(element);
Node enableEscaping = document.createProcessingInstruction(StreamResult.PI_ENABLE_OUTPUT_ESCAPING, "&");
document.appendChild(enableEscaping);
Dave Jarvis
Łukasz Woźniczka
5

Set the text content directly to the character you want, and the serializer will escape it for you if necessary:

element.setTextContent("\u00A0");
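A minimal sketch of this approach as a complete program. It goes one step beyond the line above by also setting the output encoding to `US-ASCII` (an assumption added here, not part of the snippet), so that the serializer cannot write U+00A0 directly and has to fall back to a numeric character reference. The class name `NbspDemo` and the helper `serialize` are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayOutputStream;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; name and helper are not from the question.
final class NbspDemo {

    // Builds <element>text</element> and serializes it with the given encoding.
    static String serialize(String text, String encoding) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();

        Element element = document.createElement("element");
        element.setTextContent(text); // the real character, not the string "&#160;"
        document.appendChild(element);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Assumption: an ASCII-only encoding forces a character reference for U+00A0.
        transformer.setOutputProperty(OutputKeys.ENCODING, encoding);

        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        transformer.transform(new DOMSource(document), new StreamResult(stream));
        // Decode with the same encoding that was used for serialization.
        return stream.toString(encoding);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize("\u00A0", "US-ASCII"));
    }
}
```

With the transformer bundled in the JDK, the element content should come out as `&#160;` rather than `&amp;#160;`. With `UTF-8` as the encoding, the same code writes the non-breaking space as a raw character, which an XML parser treats identically.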
forty-two
  • This way there's no escaping, since 0xA0 isn't a special character in XML. So it may not be what oleh wants. – Andrei LED Sep 19 '11 at 10:12
  • Well, I have accepted the answer because it worked for me, although you are right that it was not the answer I was truly looking for. – oleh Sep 19 '11 at 10:25
  • @oleh This is what you needed. If you really, really, really want the non-breaking space escaped, then set the encoding to `US-ASCII`, or write your own serializer. – forty-two Sep 19 '11 at 10:40
1

Try to use

element.appendChild (document.createCDATASection ("&#160;"));

instead of

element.setTextContent(...);

You'll get this in your XML: `<![CDATA[&#160;]]>`. It may work, if I understand correctly what you're trying to do.

Andrei LED
  • I tried your advice, but it does not work for me. The generated character data `<![CDATA[&#160;]]>` caused some trouble in my further code; I will see if I can handle this later. – oleh Sep 19 '11 at 10:38
  • After some tries, the result of this approach is the same as with a simple text node. The text `&#160;` is handled as separate characters instead of one entity, so further processing leads to `&amp;#160;`. – oleh Sep 20 '11 at 07:16
0

As an addendum to forty-two's answer:

If, like me, you're trying the code in a non-patched Eclipse IDE, you're likely to see some weird A's appearing in front of the non-breaking space. This is because the encoding of the Eclipse console does not match the UTF-8 output.

Adding `-Dfile.encoding=UTF-8` to your eclipse.ini should solve this.
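The stray character can be reproduced without Eclipse. The sketch below assumes the mismatch is UTF-8 bytes being decoded as ISO-8859-1; the charset your console actually uses may differ, and `EncodingMismatchDemo` and `misdecode` are hypothetical names used only for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
final class EncodingMismatchDemo {

    // Encodes the text as UTF-8, then decodes the bytes with a different charset.
    static String misdecode(String text, Charset wrongCharset) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, wrongCharset);
    }

    public static void main(String[] args) {
        // U+00A0 is two bytes in UTF-8 (0xC2 0xA0); read as ISO-8859-1,
        // each byte becomes its own character.
        String garbled = misdecode("\u00A0", StandardCharsets.ISO_8859_1);
        for (char c : garbled.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c); // prints U+00C2, then U+00A0
        }
    }
}
```

U+00C2 is `Â`, which is the extra "A" in front of the space. In the question's code, `stream.toString()` decodes with the platform default charset, so passing the charset explicitly (for example `stream.toString("UTF-8")`) is another way to avoid the mismatch at that step.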

Cheers, Wim

Wivani