39

Is there a native way to encode or decode HTML entities using JavaScript or ES6? For example, < would be encoded as &lt;. There are libraries like html-entities for Node.js but it feels like there should be something built into JavaScript that already handles this common need.

Marty Chang
  • 3
    There is not a native JavaScript facility. JavaScript the programming language does not really have much to do with HTML, goofy APIs on the String prototype notwithstanding. – Pointy Oct 26 '16 at 13:39
  • 1
    @Pointy I think generally speaking you're right. It just feels like since JavaScript is so widely used on the web, and HTML entities are a common feature of web development, something like this would've made its way into the language over the past decade. – Marty Chang Oct 26 '16 at 14:21
  • I think the question would benefit from clearly including the existence of such a function in browsers and nodejs standard library in its scope. – hippietrail Nov 24 '17 at 04:19

5 Answers

30

A nice function using ES6 for escaping HTML:

const escapeHTML = str => str.replace(/[&<>'"]/g, 
  tag => ({
      '&': '&amp;',
      '<': '&lt;',
      '>': '&gt;',
      "'": '&#39;',
      '"': '&quot;'
    }[tag]));
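
As a usage sketch, assuming the function above is used to interpolate untrusted text into markup (the `renderComment` helper and the `comment` class name are hypothetical and only there for illustration):

```javascript
// escapeHTML is repeated from the answer above so the snippet runs on its own.
const escapeHTML = str => str.replace(/[&<>'"]/g,
  tag => ({
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    "'": '&#39;',
    '"': '&quot;'
  }[tag]));

// Hypothetical helper: wraps untrusted text in a paragraph element.
const renderComment = text => `<p class="comment">${escapeHTML(text)}</p>`;

console.log(renderComment('<script>alert("hi")</script> & more'));
// <p class="comment">&lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt; &amp; more</p>
```

Only the five characters in the character class are touched, so the output stays readable while remaining safe to place in element content or a quoted attribute value.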
asafel
  • 2
    +1 for being nice and simple and for working without depending on some 3rd-party code ... though the fallback is unnecessary due to the regexp – Thomas Urban Feb 16 '20 at 19:58
6

There is no native function in the JavaScript API that converts ASCII characters to their "html-entities" equivalent. Here is the beginning of a solution and an easy trick that you may like.

A. Gille
  • 1
    Thanks for the answer (inconvenient as it may be) that what I want doesn't exist. Can you post a different solution though? Or just remove the solution link? That linked solution neither decodes HTML entities nor handles `&` vs. numeric encoding. – Marty Chang Oct 26 '16 at 14:23
6

Roll Your Own (caveat - use HE instead for most use cases)

For pure JS without a library, you can encode and decode HTML entities like this:

// Encode every character as a decimal numeric character reference
const encode = str => {
  const buf = [];

  for (let i = 0; i < str.length; i++) {
    buf.push(`&#${str.charCodeAt(i)};`);
  }

  return buf.join('');
};

// Decode decimal numeric references only; named entities are left untouched
const decode = str =>
  str.replace(/&#(\d+);/g, (match, dec) => String.fromCharCode(Number(dec)));

Usages:

encode("Hello > © <") // "&#72;&#101;&#108;&#108;&#111;&#32;&#62;&#32;&#169;&#32;&#60;"
decode("Hello &gt; &copy; &#169; &lt;") // "Hello &gt; &copy; © &lt;"

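However, the usage above shows that this approach has a couple of shortcomings: encode converts every character, even safe ones such as H, and decode only understands decimal numeric references, so named entities such as &gt; and &copy; are left untouched.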
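
A minimal sketch of how a hand-rolled version might narrow those gaps, assuming it is acceptable to encode only the five HTML-special characters plus anything outside printable ASCII, and to decode decimal and hexadecimal numeric references. The names `encodeUnsafe` and `decodeNumeric` are made up for this example, and named entities such as &copy; are still not decoded, which is the main reason to reach for a library:

```javascript
// Encode only HTML-special characters and characters above printable ASCII.
// Array.from iterates by code point, so astral characters stay in one piece.
const encodeUnsafe = str =>
  Array.from(str, ch => {
    const cp = ch.codePointAt(0);
    const unsafe = '&<>"\''.includes(ch) || cp > 0x7e;
    return unsafe ? `&#${cp};` : ch;
  }).join('');

// Decode decimal (&#169;) and hexadecimal (&#xA9;) numeric references.
const decodeNumeric = str =>
  str.replace(/&#(?:x([0-9a-f]+)|(\d+));/gi, (match, hex, dec) => {
    const cp = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave out-of-range references untouched instead of throwing a RangeError.
    return cp <= 0x10ffff ? String.fromCodePoint(cp) : match;
  });

console.log(encodeUnsafe('Hello > © <'));      // Hello &#62; &#169; &#60;
console.log(decodeNumeric('&#62; &#xA9; &gt;')); // > © &gt;
```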


Use the HE Library (Html Entities)

Usage:

he.encode('foo © bar ≠ baz 𝌆 qux');
// Output : 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'

he.decode('foo &copy; bar &ne; baz &#x1D306; qux');
// Output : 'foo © bar ≠ baz 𝌆 qux'


KyleMit
4

To unescape HTML entities, your browser is smart and will do it for you.

Way1

function _unescape(html: string): string {
   // The browser parses the string as HTML, so only pass trusted input here
   const divElement = document.createElement("div");
   divElement.innerHTML = html;
   return divElement.textContent || divElement.innerText || "";
}

Way2

function _unescape(html: string): string {
     let returnText = html;
     returnText = returnText.replace(/&nbsp;/gi, " ");
     returnText = returnText.replace(/&quot;/gi, `"`);
     returnText = returnText.replace(/&lt;/gi, "<");
     returnText = returnText.replace(/&gt;/gi, ">");
     // Decode &amp; last so that input such as "&amp;lt;" is not unescaped twice
     returnText = returnText.replace(/&amp;/gi, "&");
     return returnText;
}

You can also use underscore or lodash's unescape method but this ignores &nbsp; and handles only &amp;, &lt;, &gt;, &quot;, and &#39; characters.

Sunil Garg
2

The reverse (decode) of the answer (encode) @asafel provided:

const decodeEscapedHTML = (str) =>
  str.replace(
    // Match only the five entities produced by the encoder, including &#39;
    /&(?:amp|lt|gt|quot|#39);/g,
    (tag) =>
      ({
        '&amp;': '&',
        '&lt;': '<',
        '&gt;': '>',
        '&#39;': "'",
        '&quot;': '"',
      }[tag]),
  )
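
One thing worth checking in any decoder of this kind is that each entity is decoded exactly once. The sketch below compares a single-pass lookup with chained replacements that handle &amp; first; `decodeOnce`, `decodeChained`, and `ENTITIES` are illustrative names, not part of any library:

```javascript
// Lookup table for the five entities handled here.
const ENTITIES = { '&amp;': '&', '&lt;': '<', '&gt;': '>', '&#39;': "'", '&quot;': '"' };

// Single pass: every match is replaced once and the result is never rescanned.
const decodeOnce = str =>
  str.replace(/&(?:amp|lt|gt|quot|#39);/g, tag => ENTITIES[tag]);

// Chained passes: decoding &amp; first lets a later pass decode the result again.
const decodeChained = str =>
  str.replace(/&amp;/g, '&').replace(/&lt;/g, '<').replace(/&gt;/g, '>');

console.log(decodeOnce('&amp;lt;'));    // &lt;
console.log(decodeChained('&amp;lt;')); // <
```

The single-pass form returns the text that was originally escaped, while the chained form turns an escaped "&lt;" into a real "<".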
Ryan - Llaver