I am getting characters like this while executing the web service from soap-ui:

&#20975;&#40857;

How can I decode them?
The expected value is Chinese text: 凯龙 should appear in place of these encoded characters.

Rao
Shrikant Gupta
  • Those are HTML entities; I'm sure there's a library, or code samples, somewhere, to do that for you... Also, you are "getting characters", how? In a string? – fge Feb 19 '16 at 08:58
  • Yes, I am getting these values in a string. I am not sure whether these are HTML entities or something else. My expected result is the Chinese value I mentioned above. I have to do this in Java code, inside a function in a Java file. – Shrikant Gupta Feb 19 '16 at 09:01
  • Well then, grab the numbers between `&#` and `;` (those are code points; see [here](http://www.fileformat.info/info/unicode/char/20975/index.htm) for the first of them) and use `Character.toChars()`. – fge Feb 19 '16 at 09:09
  • `new String(new char[]{20975,40857})` is `"凯龙"`. –  Feb 19 '16 at 09:16
  • Have a look at this SO answer [how-to-decode-html-character-entities-in-java](http://stackoverflow.com/questions/994331/java-how-to-decode-html-character-entities-in-java-like-httputility-htmldecode#994339) – SubOptimal Feb 19 '16 at 09:24
  • In what context are you getting these? – Raedwald Feb 19 '16 at 09:33

1 Answer

This is actually normal. These are numeric character references, and a good parser is supposed to understand them. For example, if you open this XML in a browser, you will see the decoded characters.

Some security libraries, such as ESAPI from OWASP, are very strict about escaping rules and apply this encoding. Meanwhile, w3.org does not recommend using it, since such a message cannot be read in a simple text editor, among other reasons.

Some information in here

A simple way to unescape is to use the Apache Commons class `StringEscapeUtils` (for example, `StringEscapeUtils.unescapeHtml4(...)` in Commons Lang 3 or Commons Text).
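
If adding a dependency is not an option, the approach suggested in the comments (take the number between `&#` and `;` and pass it to `Character.toChars()`) can be sketched with the standard library only. This is a minimal sketch, not a full HTML unescaper: the class name `NumericEntityDecoder` and the method `decode` are hypothetical, and it assumes only numeric references (decimal or hexadecimal) need decoding, so named entities such as `&amp;` are left untouched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: decodes numeric character references only.
final class NumericEntityDecoder {

    // Group 1 captures a hexadecimal reference (&#x51EF;), group 2 a decimal one (&#20975;).
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));");

    private NumericEntityDecoder() {
    }

    static String decode(String input) {
        Matcher m = NUMERIC_REF.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement;
            try {
                int codePoint = m.group(1) != null
                        ? Integer.parseInt(m.group(1), 16)
                        : Integer.parseInt(m.group(2), 10);
                // Character.toChars handles supplementary code points (surrogate pairs) too.
                replacement = Character.isValidCodePoint(codePoint)
                        ? new String(Character.toChars(codePoint))
                        : m.group();
            } catch (NumberFormatException e) {
                // The number does not fit in an int: keep the reference as it was.
                replacement = m.group();
            }
            // quoteReplacement guards against '$' or '\' in the replacement text.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With this sketch, `NumericEntityDecoder.decode("&#20975;&#40857;")` yields the two characters U+51EF and U+9F99, which is the same string as `new String(new char[]{20975,40857})` from the comments.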

Rao
simar