
I've got some data from DBpedia using Jena, and since Jena's output is XML-based, some characters come back escaped, like the following:

Guns n & amp;#39; Roses

I just want to know what kind of encoding this is. I want to decode/encode my input the same way in JavaScript and send it back to a servlet.

(Edited post: if you remove the space between & and amp you get the actual text. I couldn't find a way to show it literally on Stack Overflow, so I wrote it with the space.)

kapa
Vahid Hashemi

2 Answers


This looks like XML entity encoding, specifically a decimal numeric character reference.

A numeric character reference refers to a character by its Universal Character Set/Unicode code point, and uses the format &#nnnn; (decimal) or &#xhhhh; (hexadecimal).

You can get some info here: List of XML and HTML character entity references on Wikipedia.

Your character is number 39, the apostrophe ('), which can also be written with the character entity reference &apos;.
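To make the mapping from reference to character concrete, here is a minimal sketch in JavaScript. The helper name decodeNumericReference is hypothetical (it is not part of any library), and it assumes the input is exactly one reference in the decimal or lowercase-x hexadecimal form:

```javascript
// Hypothetical helper: turn one numeric character reference,
// e.g. "&#39;" (decimal) or "&#x27;" (hexadecimal), into its character.
function decodeNumericReference(ref) {
  const match = /^&#(x[0-9a-fA-F]+|[0-9]+);$/.exec(ref);
  if (match === null) {
    return ref; // not a numeric character reference, leave it alone
  }
  const body = match[1];
  const codePoint = body[0] === "x"
    ? parseInt(body.slice(1), 16) // hexadecimal form
    : parseInt(body, 10);         // decimal form
  // Code points above U+10FFFF are not valid Unicode; leave them alone.
  return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : ref;
}

console.log(decodeNumericReference("&#39;"));  // the apostrophe
console.log(decodeNumericReference("&#x27;")); // the same character, hex form
```

Both calls refer to the same code point, 39 in decimal and 27 in hexadecimal.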

To decode this in JavaScript, you could use, for example, php.js, which has an html_entity_decode() function (note that it depends on get_html_translation_table()).
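If pulling in a library is not an option, a small hand-written decoder may be enough. The following is only a sketch, not the php.js implementation: the function name decodeXmlEntities is made up for illustration, and it assumes you only need numeric references plus the five entities predefined by XML (amp, lt, gt, quot, apos). Anything else, such as HTML-only names, is left untouched.

```javascript
// The five entities predefined by XML.
const XML_NAMED_ENTITIES = { amp: "&", lt: "<", gt: ">", quot: '"', apos: "'" };

// Hypothetical helper: decode every entity or numeric reference in a string,
// in a single pass (so "&amp;#39;" becomes "&#39;", not an apostrophe).
function decodeXmlEntities(text) {
  return text.replace(/&(#x[0-9a-fA-F]+|#[0-9]+|[A-Za-z]+);/g, (whole, body) => {
    if (body[0] !== "#") {
      // Named entity: decode only the ones we know, keep the rest as-is.
      return Object.prototype.hasOwnProperty.call(XML_NAMED_ENTITIES, body)
        ? XML_NAMED_ENTITIES[body]
        : whole;
    }
    const codePoint = body[1] === "x"
      ? parseInt(body.slice(2), 16) // "&#x27;" style
      : parseInt(body.slice(1), 10); // "&#39;" style
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : whole;
  });
}

console.log(decodeXmlEntities("Guns n &#39; Roses"));
```

Doing the replacement in one pass is deliberate: it keeps the decoder from accidentally decoding text that was escaped on purpose.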


UPDATE: in reply to your edit: basically that is the same; the only difference is that it was encoded twice (possibly by mistake). &amp; is the ampersand: &.
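A rough sketch of what "encoded twice" means in practice, and of the encoding direction you asked about. The names encodeXml and decodeOnce are hypothetical, and writing the apostrophe as &#39; (rather than &apos;) is an assumption chosen to match the data you showed:

```javascript
const ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
const NAMED = { amp: "&", lt: "<", gt: ">", quot: '"', apos: "'" };

// Hypothetical encoder: escape the characters that are special in XML.
function encodeXml(text) {
  return text.replace(/[&<>"']/g, (ch) => ESCAPES[ch]);
}

// Hypothetical decoder: undo exactly one level of encoding.
function decodeOnce(text) {
  return text.replace(/&(#x[0-9a-fA-F]+|#[0-9]+|[A-Za-z]+);/g, (whole, body) => {
    if (body[0] !== "#") {
      return Object.prototype.hasOwnProperty.call(NAMED, body) ? NAMED[body] : whole;
    }
    const codePoint = body[1] === "x"
      ? parseInt(body.slice(2), 16)
      : parseInt(body.slice(1), 10);
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : whole;
  });
}

const once = encodeXml("Guns n ' Roses"); // "Guns n &#39; Roses"
const twice = encodeXml(once);            // "Guns n &amp;#39; Roses"

console.log(decodeOnce(twice));             // back to the single-encoded form
console.log(decodeOnce(decodeOnce(twice))); // back to the plain text
```

If the data really is double-encoded, it is usually better to find and fix the place that encodes it the second time than to decode twice on the client.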

kapa

This is an SGML/HTML/XML numeric character entity reference.

In this case it stands for an apostrophe (').

Oded