I need to decode HTML entities such as &amp;, &lt;, &gt;, &quot;, &#x60; and &#x27;.

As recommended in this SO post, I was trying to use _.unescape() from underscore.js for this task.

However, unescape() doesn't seem to have any effect. When I call it e.g. on the following string, it just returns the string itself:

const line = 'Tweag I&#x2F;O | Paris, France &amp Berlin, Germany | Full-time. Give us a shout at jobs@tweag.io!'

To verify, you can go to JSBin and paste the following code:

const line = 'Tweag I&#x2F;O | Paris, France &amp Berlin, Germany | Full-time. Give us a shout at jobs@tweag.io!'
console.log(line)

const decodedLine = unescape(line)
console.log(decodedLine)

Don't forget to add the underscore.js library by selecting it from the dropdown that appears when you hit the Add library button.

Update

As noted in @DanPrince's answer, unescape() only decodes a limited set of characters:

&amp;, &lt;, &gt;, &quot;, &#x60;, &#x27;

But changing my line from the example above to the following still doesn't work (even though this time I use &#x27; and &amp;):

const line = 'Tweag I&#x27;O | Paris, France &amp; Berlin, Germany | Full-time. Give us a shout at jobs@tweag.io!'

Final Update

I solved my problem by using a different library. Instead of underscore.js, I am now using he which provides exactly the functionality I was looking for.

Now I can just call decode(line) and all HTML entities get properly translated. However, I will keep following the answers to this question and will accept the one that explains why unescape() doesn't work as expected.

nburk
  • D'oh. Is this as simple as `unescape` vs `_.unescape`? – Dan Prince Apr 01 '16 at 10:02
  • argh... yeah that does work in jsbin! sorry, I'm a js newbie and am never sure about these subtleties! – nburk Apr 01 '16 at 10:05
  • Not at all, I should have spotted it before. The browser includes a native [unescape](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/unescape) function for decoding URI's. – Dan Prince Apr 01 '16 at 10:06

1 Answer

Looking at the source for underscore, everything is translated through the following maps.

var escapeMap = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;',
  '`': '&#x60;'
};
var unescapeMap = _.invert(escapeMap);

The two escaped entities in your string are &#x2F; and &amp (without a trailing semicolon), neither of which appears in the unescape map. You can fix &amp by adding the semicolon: &amp;.
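To illustrate why only exact matches get decoded, here is a minimal sketch of a map-based unescaper. It is not underscore's actual implementation: `unescapeMap`, `entityPattern` and `unescapeHtml` are hypothetical names, and the sketch assumes that an entity is only replaced when it matches a map key exactly, semicolon included.

```javascript
// Hypothetical stand-in for the inverted escape map shown above.
const unescapeMap = {
  '&amp;': '&',
  '&lt;': '<',
  '&gt;': '>',
  '&quot;': '"',
  '&#x27;': "'",
  '&#x60;': '`'
};

// One alternation regex built from the map keys
// (none of the keys contain regex metacharacters).
const entityPattern = new RegExp('(?:' + Object.keys(unescapeMap).join('|') + ')', 'g');

// Replace every exact key match with its decoded character.
function unescapeHtml(text) {
  return text.replace(entityPattern, (entity) => unescapeMap[entity]);
}

const original = 'Tweag I&#x2F;O | Paris, France &amp Berlin';
const withSemicolon = 'Tweag I&#x2F;O | Paris, France &amp; Berlin';

// Unchanged: neither &#x2F; nor &amp (no semicolon) is a key in the map.
console.log(unescapeHtml(original));

// Only &amp; is decoded; &#x2F; is still left as-is.
console.log(unescapeHtml(withSemicolon));
```

Under these assumptions, a string stays untouched unless it contains one of the six mapped entities in exactly the listed form.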

Whilst it's not particularly efficient, you could use the answer suggested here.

Also, I'm getting the expected behaviour when I use _.unescape in jsbin, whereas I think your code uses the native unescape function.
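The difference can be checked without any library. The sketch below assumes an environment that still exposes the legacy global `unescape` (browsers and Node.js do); that function decodes percent-escapes such as `%20`, not HTML entities, which would explain why calling it on the string appears to do nothing. The sample strings are made up for illustration.

```javascript
const line = 'Paris, France &amp; Berlin';

// Native global unescape: handles %XX and %uXXXX sequences only.
const viaNative = unescape(line);
console.log(viaNative === line); // true: the HTML entity is left untouched

// The same function does decode percent-escapes.
console.log(unescape('Paris%2C%20France')); // "Paris, France"
```

With underscore loaded, `_.unescape(line)` is the call that goes through the entity map instead.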

Dan Prince
  • thanks for the hint about the mapped characters! but strangely it doesn't work with any of those either... I changed the string I am testing to `'Tweag I&#x27;O | Paris, France &amp; Berlin, Germany | Full-time. Give us a shout at jobs@tweag.io!'`. in that version I have the `&amp;` as well as `&#x27;`, but in my JSBin it's still not working... – nburk Apr 01 '16 at 09:19