
I am using XML DOM techniques to build a pulldown menu in JavaScript.

After I create an <option> node, I append the text that should appear for that option. The problem is that when the text contains a character entity reference (CER) such as &#8322;, the & of the CER is escaped to &amp;, so the select menu displays the literal CER rather than the character it stands for when the menu is output to the page.

I have tried both of the following methods:

optionNode.appendChild(xmlDoc.createTextNode(label));

and

optionNode.textContent = label;

Both give the same result. I can work around the problem by doing a global replace of &amp; with & after I output the XML document to text:

var xml = (new XMLSerializer()).serializeToString(xmlDoc);
return xml.replace(/&amp;/g, '&');
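A narrower variant of this workaround, as a sketch only: it un-escapes &amp; only where it begins a numeric character reference and leaves every other escaped ampersand alone. The helper name unescapeNumericRefs is hypothetical, and named references such as &nbsp; would not be restored by it.

```javascript
// Hypothetical helper (not part of any library): undo the escaping only where
// "&amp;" is immediately followed by a decimal or hexadecimal numeric
// reference body, e.g. "&amp;#8322;" or "&amp;#x2082;".
function unescapeNumericRefs(xml) {
    return xml.replace(/&amp;(#(?:\d+|[xX][0-9a-fA-F]+);)/g, '&$1');
}

// The attribute value keeps its escaped ampersand; only the reference changes.
var serialized = '<option value="a&amp;b">H&amp;#8322;O</option>';
console.log(unescapeNumericRefs(serialized));
// <option value="a&amp;b">H&#8322;O</option>
```

The trade-off is that a label meant to show the literal text &#8322; can no longer be told apart from a real reference, so this only fits when labels are known to use numeric references deliberately.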

But I'm sure there must be a way to avoid the escaping in the first place. Is there?

Peter Mortensen

2 Answers


You could use createCDATASection() instead of createTextNode():

var docu = new DOMParser().parseFromString('<xml></xml>', "application/xml");
var cdata = docu.createCDATASection('Some <CDATA> data & then some');
docu.getElementsByTagName('xml')[0].appendChild(cdata);

alert(new XMLSerializer().serializeToString(docu));
// Displays: <xml><![CDATA[Some <CDATA> data & then some]]></xml>
Mads Hansen
  • That has the same problem. – Quentin Mar 21 '18 at 20:48
  • Yes, I tried that, but then my string output contains <![CDATA[...]]>, which appears in the pulldown menu, so I am back to doing a global replace. – Robert B. Grossman Mar 23 '18 at 00:48
  • Well, it sounds like you don’t want an XML serialization then if CDATA and entities are a problem. How are you reading or applying the output? It should be “read” and interpreted without issue, unless you are not using XML apis to read. – Mads Hansen Mar 23 '18 at 00:53
  • As I said, I am building a pulldown menu, where the options displayed to the user may contain character entity references. I append the string obtained from the XMLSerializer to a larger string of HTML, and I then use innerHTML to put the larger string (including the pulldown menu) on the page. If the XML output from the serializer contains &#8322; (for example), it appears in the pulldown menu as ₂. If the XML output contains <![CDATA[...]]>, it appears explicitly as well. – Robert Grossman Mar 28 '18 at 21:10
0

I found a solution. Before I create a node containing label, I convert all the character entity references in label to the Unicode characters they represent. Then, when I output the XML as a string, I convert all the non-ASCII characters back to numeric character references. The code is adapted from code that I found elsewhere on Stack Overflow.

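function cerToUnicode(str) {
    "use strict";
    // Named references to decode; extend as needed.
    var entity_table = {
        '&quot;': String.fromCharCode(34), // Quotation mark. Not required
        '&amp;': String.fromCharCode(38), // Ampersand
        '&lt;': String.fromCharCode(60), // Less-than sign
        '&gt;': String.fromCharCode(62), // Greater-than sign
        '&nbsp;': String.fromCharCode(160), // Non-breaking space
        '&iexcl;': String.fromCharCode(161) // Inverted exclamation mark
        // ... other named CERs
    };
    // Decode decimal numeric references. &#38; becomes '&amp;' rather than '&'
    // so that the named pass below cannot combine the resulting & with the
    // text that follows it into a new reference.
    str = str.replace(/&#(\d+);/g,
        function (matched, capture1) {
            var code = parseInt(capture1, 10);
            if (code === 38) {
                return '&amp;';
            }
            // Leave out-of-range values untouched instead of throwing.
            return (code <= 0x10FFFF ? String.fromCodePoint(code) : matched);
        });
    // Decode named references; names missing from the table are left as they are.
    str = str.replace(/&[A-Za-z][A-Za-z0-9]*;/g,
        function (matched) {
            return (Object.prototype.hasOwnProperty.call(entity_table, matched) ? entity_table[matched] : matched);
        });
    return str;
} // cerToUnicode()

function unicodeToCER(str) {
    "use strict";
    // The u flag makes each match a whole code point, so a character outside
    // the BMP becomes one reference rather than two surrogate halves.
    return str.replace(/[^\x00-\x7F]/gu, function (s) {
        return "&#" + s.codePointAt(0) + ";";
    });
} // unicodeToCER()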
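
To see why the round trip works, here is a sketch that runs outside a browser. All three function names are hypothetical: decodeNumericRefs is a reduced stand-in for cerToUnicode that handles decimal references only, escapeXmlText is an assumed approximation of the escaping a serializer applies to text nodes (it is not the real XMLSerializer), and encodeNonAscii plays the role of unicodeToCER.

```javascript
// Reduced decoder for illustration: decimal numeric references only.
function decodeNumericRefs(str) {
    return str.replace(/&#(\d+);/g, function (matched, digits) {
        var code = parseInt(digits, 10);
        // Out-of-range values are left as they are instead of throwing.
        return code <= 0x10FFFF ? String.fromCodePoint(code) : matched;
    });
}

// Assumed approximation of how a serializer escapes text node content.
function escapeXmlText(str) {
    return str.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Re-encode every non-ASCII code point as a decimal reference.
function encodeNonAscii(str) {
    return str.replace(/[^\x00-\x7F]/gu, function (ch) {
        return '&#' + ch.codePointAt(0) + ';';
    });
}

var label = 'H&#8322;O & CO&#8322;';
var decoded = decodeNumericRefs(label);   // holds real subscript-two characters
var serialized = escapeXmlText(decoded);  // only the bare & is escaped
var output = encodeNonAscii(serialized);  // references restored after escaping

console.log(output); // H&#8322;O &amp; CO&#8322;
```

Because the references are already real characters by the time the text is escaped, there is no & left in them to escape, and the bare ampersand in the label still comes out correctly as &amp;.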