
Consider the following program:

    import java.io.StringReader;

    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    public class CrDemo {
        public static void main(String[] args) throws Exception {
            final String xml = "<a>foo&#13;\nbar&#13;\n</a>";
            final TransformerFactory tf = TransformerFactory.newInstance();
            final Transformer t = tf.newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            t.setOutputProperty(OutputKeys.INDENT, "no");
            t.setOutputProperty(OutputKeys.STANDALONE, "yes");
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(System.out));
        }
    }

The output looks like this:

    <a>foo&#13;
    bar&#13;
    </a>

Is it possible to prevent the Transformer from escaping CR?

user1774051
  • Can you not simply use `\r` instead of `&#13;`? `xml = xml.replace("&#13;", "\r");` – Joop Eggen Oct 17 '19 at 11:45
  • Why are there CR characters in your text? See [XML Carriage return encoding](https://stackoverflow.com/a/2266166/5221149), which explains that an XML parser would have suppressed any literal CR in an XML document. Since you explicitly added CR characters to the XML document, the system is correctly *preserving* them by escaping them as `&#13;`; otherwise they would disappear when read back in. If you don't want CR in the generated XML document, don't insert CR characters. – Andreas Oct 17 '19 at 12:13

1 Answer


If the input XML contained literal CR characters, they would be removed during parsing: XML parsers normalize line endings (CR LF, or a lone CR) to a single LF (newline) character. This normalization doesn't apply if the CR is written as the character reference &#13;.
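To make the difference concrete, here is a small sketch that parses both variants with the JDK's default DOM parser and checks whether a CR reaches the text node. The class name `CrParseDemo` and the helper `rootText` are made up for this illustration.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.xml.sax.InputSource;

public class CrParseDemo {
    // Hypothetical helper: parses the XML and returns the root element's text content.
    static String rootText(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Literal CR LF in the source: the parser normalizes it to a single LF.
        final String literal = rootText("<a>foo\r\nbar</a>");
        // CR written as a character reference: it is not normalized away.
        final String escaped = rootText("<a>foo&#13;\nbar</a>");

        System.out.println(literal.indexOf('\r') >= 0); // false
        System.out.println(escaped.indexOf('\r') >= 0); // true
    }
}
```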

So if a text node contains a CR character, the XSLT processor assumes you have worked hard to put it there and that you really want it, and it therefore outputs it in such a way that it will survive round-tripping where the resulting serialized output is re-processed by an XML parser.
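The round trip can be checked with a sketch like the following, which serializes the question's input with an identity transform and then parses the result again. `CrRoundTripDemo`, `serialize` and `rootText` are names invented for this example, and it assumes the JDK's built-in transformer and parser.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

import org.xml.sax.InputSource;

public class CrRoundTripDemo {
    // Hypothetical helper: serializes the input with an identity transform, as in the question.
    static String serialize(String xml) {
        try {
            final Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            final StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: parses the XML and returns the root element's text content.
    static String rootText(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        final String serialized = serialize("<a>foo&#13;\nbar</a>");
        final String reparsed = rootText(serialized);
        // Because the serializer escaped the CR, it is still there after re-parsing.
        System.out.println(reparsed.indexOf('\r') >= 0); // true
    }
}
```

Had the serializer written the CR as a raw character instead, the second parse would have normalized it to a newline and the text would have changed silently.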

Of course, you can get rid of CR characters in your XSLT code, just as you can get rid of any other characters. But it won't happen automatically.
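One way to do that, sketched under the assumption that dropping every CR in every text node is acceptable: an identity stylesheet with an extra `text()` template that uses `translate()` to delete the CR. The class name `CrStripDemo` and the method `stripCr` are invented for this example; the stylesheet is ordinary XSLT 1.0.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CrStripDemo {
    // Identity transform plus a higher-priority text() template that deletes CR characters.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match='text()' priority='1'>"
        + "<xsl:value-of select=\"translate(., '&#13;', '')\"/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Hypothetical helper: runs the stylesheet above over the given XML string.
    static String stripCr(String xml) {
        try {
            final Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            final StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The serialized result should no longer contain any &#13; references.
        System.out.print(stripCr("<a>foo&#13;\nbar&#13;\n</a>"));
    }
}
```

The explicit `priority='1'` is there because `text()` and `node()` have the same default priority, so without it the two templates would conflict.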

Michael Kay