
I have a string that contains XML. It includes the following substring:

<Subject>&amp;#55357;&amp;#56898;&amp;#55357;&amp;#56838;&amp;#55357;&amp;#56846;&amp;#55357;&amp;#56838;&amp;#55357;&amp;#56843;&amp;#55357;&amp;#56838;&amp;#55357;&amp;#56843;&amp;#55357;&amp;#56832;&amp;#55357;&amp;#56846;</Subject>

I'm pulling the XML from a server and need to display it to the user. I've noticed that the ampersands have been escaped and that the numeric references are UTF-16 surrogate pairs. How do I ensure the emojis/emoticons are displayed correctly in a browser?

Currently I'm just getting these characters: �������������� instead of the actual emojis.

I'm looking for a simple way to fix this without external libraries or third-party code if possible: just plain JavaScript, HTML, or CSS.

mtotowamkwe
  • Possible help: https://stackoverflow.com/questions/47187165/convert-55357-56911-to-emoji-in-html-using-php – Blue Oct 09 '18 at 21:50

1 Answer


You can convert UTF-16 code units, including surrogates, to a JavaScript string with String.fromCharCode. The following snippet should give you an idea.

var str = '&amp;#55357;&amp;#56898;ABC&amp;#55357;&amp;#56838;&amp;#55357;&amp;#56846;&amp;#55357;&amp;#56838;&amp;#55357;&amp;#56843;&amp;#55357;&amp;#56838;&amp;#55357;&amp;#56843;&amp;#55357;&amp;#56832;&amp;#55357;&amp;#56846;';

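// Regex matching either an escaped numeric reference (one UTF-16 code unit)
// or any other single code unit, including a stray "&".
var re = /&amp;#(\d+);|([\s\S])/g;
var match;
var charCodes = [];

// Find successive matches
while ((match = re.exec(str)) !== null) {
  if (match[1] != null) {
    // Escaped code unit (possibly one half of a surrogate pair)
    charCodes.push(Number(match[1]));
  }
  else {
    // Unescaped character. Without the u flag the regex matches one
    // UTF-16 code unit at a time, so charCodeAt(0) is sufficient.
    charCodes.push(match[2].charCodeAt(0));
  }
}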

// Create string from UTF-16 code units.
var result = String.fromCharCode.apply(null, charCodes);
console.log(result);
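
The same idea can also be written as a single pass with String.prototype.replace, which leaves all other text (including stray ampersands) untouched. This is only a minimal sketch: the helper name decodeEscapedReferences is hypothetical, and treating values above 0xFFFF as full code points is an assumption about what the server might send, not something shown in the question.

```javascript
// Hypothetical helper (not part of the original answer): decodes
// "&amp;#NNNNN;" references in one pass with String.prototype.replace.
function decodeEscapedReferences(input) {
  return input.replace(/&amp;#(\d+);/g, function (whole, digits) {
    var code = Number(digits);
    if (code <= 0xFFFF) {
      // A single UTF-16 code unit, possibly one half of a surrogate pair.
      return String.fromCharCode(code);
    }
    if (code <= 0x10FFFF) {
      // Assumption: larger values are full Unicode code points.
      return String.fromCodePoint(code);
    }
    // Out of range: leave the reference untouched rather than guess.
    return whole;
  });
}

var sample = '&amp;#55357;&amp;#56898;ABC&amp;#55357;&amp;#56838;';
var decoded = decodeEscapedReferences(sample);
console.log(decoded);
```

In a browser, assigning the decoded string to an element's textContent (rather than innerHTML) displays it as plain text without parsing it as markup again.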
nwellnhof