21

When I receive JSON, it contains \u003c and \u003e instead of < and >. I want to convert them back to UTF-8 in Java. Any help will be highly appreciated. Thanks.

Pavel Radzivilovsky
Mubbasher Khaliq
  • 3c and 3e *are* `<` and `>`. What do you need to convert anything for? – Hot Licks Jan 31 '12 at 07:02
  • possible duplicate of [How to convert Strings to and from UTF8 byte arrays in Java](http://stackoverflow.com/questions/88838/how-to-convert-strings-to-and-from-utf8-byte-arrays-in-java) – zellus Jan 31 '12 at 07:25
  • 2
    What JSON parser are you using? – McDowell Jan 31 '12 at 08:59
  • Indeed, the correct way to decode JSON string literals is to use a JSON parser. Do not attempt to decode escape sequences yourself because you probably won't get it exactly right. A JSON parser will give you a standard Unicode String object; if you really need to convert that into UTF-8-encoded bytes you can use `getBytes`, but I'm not sure that's really relevant. – bobince Jan 31 '12 at 22:48
  • If you are using a `StringEntity`, you should take a look at this [answer](http://stackoverflow.com/a/6228377/356895). – JJD Aug 14 '12 at 14:30
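
To make concrete what a JSON parser does with these escapes, here is a minimal sketch that decodes only the backslash-u form using the standard library. The class name `UnicodeEscapeSketch` and the method `decodeUnicodeEscapes` are hypothetical names chosen for illustration. It is not a substitute for a parser: it ignores the other JSON escapes and would mishandle an escaped backslash followed by `u003c`, which is the kind of detail the comments above warn about.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeEscapeSketch {
    // Matches a literal backslash, the letter u, and exactly four hex digits
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replaces each matched escape with the UTF-16 code unit it names
    static String decodeUnicodeEscapes(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());                   // text before the escape
            out.append((char) Integer.parseInt(m.group(1), 16));  // the decoded code unit
            last = m.end();
        }
        out.append(input, last, input.length());                  // remaining tail
        return out.toString();
    }

    public static void main(String[] args) {
        // Doubled backslashes keep the escapes as literal text, as they arrive in raw JSON
        String raw = "\\u003cb\\u003ebold\\u003c/b\\u003e";
        System.out.println(decodeUnicodeEscapes(raw)); // prints <b>bold</b>
    }
}
```

In real code, the string returned by whichever JSON parser is in use already contains `<` and `>`, so no extra step like this should be needed.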

3 Answers

13
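import java.io.UnsupportedEncodingException;

try {
    // Encode the String (UTF-16 in memory) as UTF-8 bytes
    String string = "\u003c";
    byte[] utf8 = string.getBytes("UTF-8");

    // Decode the UTF-8 bytes back into a String
    string = new String(utf8, "UTF-8");
} catch (UnsupportedEncodingException e) {
    // Not expected: every Java platform is required to support UTF-8
    throw new IllegalStateException(e);
}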

Refer to http://www.exampledepot.com/egs/java.lang/unicodetoutf8.html

Hemant Metalia
  • I have tried this technique too, but it is not working. It returns the same string that I pass in, although it works in a test application. Below is what I am using: `public static String unicodeToUTF8(String unicodeStr) { byte[] utf8 = unicodeStr.getBytes("UTF-8"); String UTF8Str = ""; UTF8Str = new String(utf8, "UTF-8"); return UTF8Str; }` – Mubbasher Khaliq Jan 31 '12 at 09:11
  • If it works in the test application, then there is some problem in the application code. Please check the function `unicodeToUTF8` in your application twice. – Hemant Metalia Jan 31 '12 at 09:18
  • I have checked this many times, and file.encoding is the same on both the test and live applications, i.e. cp1252. What are the possible options? – Mubbasher Khaliq Jan 31 '12 at 09:22
  • 1
    The reason a test application with `String string = "\u003c"` works is because `\u003c` is a compiler escape just like '\n' is a compiler escape. If you want to test JSON input you have to add an additional level of escaping: `String string = "\\u003c";` And in order to process these you need a library that handles these escapes for you. Your JSON parser should be able to do this. – bames53 Jan 31 '12 at 15:42
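
The difference between the two escaping levels can be checked directly. The following is a small sketch using only the standard library; the class name `EscapeLevels` and the helper `roundTrip` are hypothetical names chosen for illustration. It shows that the compiler-level escape yields a single character, the doubled backslash yields six, and a UTF-8 encode/decode round trip leaves the six-character form untouched, which matches the behaviour reported in the comments above.

```java
import java.nio.charset.StandardCharsets;

public class EscapeLevels {
    // Encodes to UTF-8 bytes and decodes again; the text itself is never changed by this
    static String roundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String compiled = "\u003c";   // the compiler replaces the escape: one char
        String fromJson = "\\u003c";  // what raw JSON text holds: six separate chars

        System.out.println(compiled.length());                     // prints 1
        System.out.println(fromJson.length());                     // prints 6
        System.out.println(roundTrip(fromJson).equals(fromJson));  // prints true
    }
}
```

So a method built on `getBytes` and `new String` cannot turn the six-character sequence into `<`; that step belongs to whatever decodes the JSON.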
2

You can try converting the string into a byte array

byte[] utfString = str.getBytes("UTF-8");

and convert that back to a string object by specifying the UTF-8 encoding like

str = new String(utfString, "UTF-8");
zellus
Rocky
0

You can also try this

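import java.nio.charset.StandardCharsets;
String s = "Hello World!";
// The String constructor takes bytes, not a String, so encode first
String convertedInUTF8 = new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);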