0
String s1="\u0048\u0065\u006C\u006C\u006F";   // Hello
String s2="\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F";  // ಮುಖಪುಟ (Kannada Language)

System.out.println("s1: " + StringEscapeUtils.unescapeJava(s1));  // s1: Hello
System.out.println("s2: " + StringEscapeUtils.unescapeJava(s2));  // s2: ??????

When I print s1, I get Hello as expected. When I print s2, I get ?????? instead.

I want the output as ಮುಖಪುಟ for s2. How can I achieve this?

Santosh DJ
  • Where do you want to get the output: on the command prompt of Windows/Linux or the console view of an IDE? Please mention. – Sanjeev Saha Jun 02 '16 at 05:26
  • @SanjeevSaha The IDE console. – Santosh DJ Jun 02 '16 at 05:37
  • Possible duplicate of [What is character encoding and why should I bother with it](http://stackoverflow.com/questions/10611455/what-is-character-encoding-and-why-should-i-bother-with-it) – Raedwald Jun 10 '16 at 06:56
  • @Raedwald No, my question is not related to the question you suggested. – Santosh DJ Jun 10 '16 at 07:13
  • 2 of the 3 answers suggest properly setting the character encoding. The text of your question does not indicate any awareness of the importance of character encoding for this kind of problem. – Raedwald Jun 10 '16 at 07:56

4 Answers

1
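 import java.io.ByteArrayOutputStream;
 import java.io.PrintStream;

 ByteArrayOutputStream os = new ByteArrayOutputStream();
 PrintStream ps = new PrintStream(os, true, "UTF-8");  // encode as UTF-8, not the platform default
 ps.println("\u0048\u0065\u006C\u006C\u006F \u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F");
 String output = os.toString("UTF-8");  // decode with the same charset used for encoding
 System.out.println("result: " + output);   //  Hello ಮುಖಪುಟ (if the console encoding supports it)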
Santosh DJ
0

You need to specify the encoding, e.g. "UTF-8". Try this:

String s1="\u0048\u0065\u006C\u006C\u006F";   // Hello
String s2="\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F";  // ಮುಖಪುಟ (Kannada Language)

System.out.println("s1: " + new String(s1.getBytes("UTF-8"), "UTF-8"));
System.out.println("s2: " + new String(s2.getBytes("UTF-8"), "UTF-8"));
Igoranze
  • @SantoshJadi Can you show me the exact code and output? – Igoranze Jun 01 '16 at 13:56
  • I haven't started the implementation yet; this is just for practice. I just want to know why it doesn't work for other languages. – Santosh DJ Jun 02 '16 at 05:19
  • This won't work because `new String(s2.getBytes("UTF-8"), "UTF-8")` is actually a NOP. – piet.t Jun 07 '16 at 05:50
  • @Igoranze but then I would expect `System.out.println("s2: " + s2);` to work just as well... – piet.t Jun 08 '16 at 07:08
  • @piet.t The difference is the encoding: a normal String uses UTF-16, while the new String is expected to use UTF-8. See this link for more info about the differences: http://stackoverflow.com/a/496361/4985572 – Igoranze Jun 08 '16 at 07:23
  • @Igoranze `java.lang.String` always uses UTF-16, no way around that. Other encodings only come into play when converting to or from `byte`s, e.g. when writing to or reading from a stream. – piet.t Jun 08 '16 at 07:29
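
To illustrate the point made in the comments above, here is a minimal sketch showing that encoding a String to UTF-8 bytes and decoding those bytes again yields an equal String, so the round trip cannot change what `System.out` prints. The class name `RoundTripDemo` and the helper `roundTrip` are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {
    // Hypothetical helper: encodes the text as UTF-8 bytes and decodes them straight back.
    static String roundTrip(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s2 = "\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F";

        // The round trip gives back an equal String, so nothing was "converted".
        System.out.println(roundTrip(s2).equals(s2));                      // true

        // In memory the String holds 6 UTF-16 code units ...
        System.out.println(s2.length());                                   // 6

        // ... and only the byte array produced by getBytes is UTF-8 (3 bytes per character here).
        System.out.println(s2.getBytes(StandardCharsets.UTF_8).length);    // 18
    }
}
```

The encoding only matters at the point where characters become bytes, which for `System.out` happens inside the stream, not inside the String.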
0

If you are using Eclipse, please have a look at: https://decoding.wordpress.com/2010/03/18/eclipse-how-to-change-the-console-output-encoding/

After changing the console encoding, simply print the strings to the console as follows:

String s1="\u0048\u0065\u006C\u006C\u006F";   
String s2="\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F";
System.out.println("s1: " + s1);  // s1
System.out.println("s2: " + s2);  // s2

Hope this helps.

Sanjeev Saha
  • Yes, I tried this. It only works for s1 (English); for s2 (Kannada) it gives the same output, ??????. – Santosh DJ Jun 02 '16 at 06:06
0

The problem is most probably that System.out is not prepared to deal with Unicode. It is an output stream whose text gets encoded in the so-called default encoding.

The default encoding is often (e.g. on Windows) some proprietary 8-bit character set that simply can't represent most Unicode characters, so anything it cannot encode is replaced with a question mark.
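
As a rough way to check this on your own machine, the sketch below asks the default charset whether it can represent the Kannada string, and shows how an 8-bit charset turns unmappable characters into question marks. The class and method names are made up for this illustration, and ISO-8859-1 is used only as a stand-in for a typical 8-bit default. Note that on recent JDKs `System.out` may use a console-specific encoding rather than the default charset, so treat the first two lines of output as a hint rather than proof.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetCheck {
    // Hypothetical helper: true if the charset can represent every character of the text.
    static boolean canEncode(Charset charset, String text) {
        return charset.newEncoder().canEncode(text);
    }

    // Hypothetical helper: encodes and decodes with the same charset,
    // which loses every character that the charset cannot represent.
    static String lossyRoundTrip(Charset charset, String text) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String s2 = "\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F";

        // Platform dependent: shows what this JVM considers its default encoding.
        Charset def = Charset.defaultCharset();
        System.out.println("default charset: " + def);
        System.out.println("can encode s2:   " + canEncode(def, s2));

        // An 8-bit charset has no Kannada characters, so each one becomes '?'.
        System.out.println(lossyRoundTrip(StandardCharsets.ISO_8859_1, s2));   // ??????
    }
}
```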

My tip: For the sake of testing, create your own PrintStream or PrintWriter with UTF-8 encoding.
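
A minimal sketch of that tip, assuming the console (for example the IDE console view) is itself configured to display UTF-8. The class name `Utf8Console` and the helper `utf8Stream` are made up for this illustration; the three-argument `PrintStream` constructor is the standard one.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Console {
    // Hypothetical helper: wraps a stream in a PrintStream that encodes text as UTF-8
    // and flushes automatically on println.
    static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8Stream(System.out);
        out.println("s1: " + "\u0048\u0065\u006C\u006C\u006F");
        out.println("s2: " + "\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F");
    }
}
```

If the console still shows question marks or garbled characters, the bytes are most likely correct and the console is decoding them with a different encoding.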

Ingo
  • Yes, I checked with PrintStream and it prints correctly: `PrintStream printStream = new PrintStream(System.out, true, "UTF-8"); printStream.println("\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F");` How can I assign the printed value to a String? – Santosh DJ Jun 03 '16 at 10:32
  • @Chetan How about `String foo = "\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F";` – Ingo Jun 03 '16 at 12:23
  • It works with PrintStream in the console, but when I send it to a browser it prints ??????. – Santosh DJ Jun 03 '16 at 12:29
  • Please ask a new question if your original problem is solved. No follow-up questions in comments. And make sure to explain what "send it to browser" actually means. – Ingo Jun 03 '16 at 14:42
  • I will give you a +1 once I reach 15 reputation ;) – Santosh DJ Jun 04 '16 at 03:21