The code below prints the Unicode string கா:

PrintStream sysout = new PrintStream(System.out, true, "UTF-8");
sysout.println("\u0B95\u0bbe");

By giving கா as input, can I get the hex values as \u0B95 and \u0bbe?

PS: The text is Tamil.

Santosh Jadi
user1611248
  • I don't think it is a duplicate. That solution is for a single char, but கா is a combination of two chars, which is why there are two hex values. – user1611248 May 18 '13 at 15:38

2 Answers

You can use the format functionality to print the Java UTF-16 string escapes.

For example, this code writes the escapes to STDOUT:

String str = "கா";
for(char ch : str.toCharArray())
   System.out.format("\\u%04x", (int) ch);
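If the escapes are needed as a String rather than printed directly, the same loop can be wrapped in a small helper. This is only a minimal sketch: the class name UnicodeEscapes and the method toEscapes are hypothetical names, not part of the answer above, and the input is written with escapes so the result does not depend on the source file encoding.

```java
class UnicodeEscapes {
    // Hypothetical helper: builds the escape sequence for every UTF-16 code unit.
    static String toEscapes(String text) {
        StringBuilder out = new StringBuilder();
        for (char unit : text.toCharArray()) {
            // Format each code unit as four lowercase hex digits, zero padded.
            out.append(String.format("\\u%04x", (int) unit));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Same two-character Tamil string as in the question.
        System.out.println(toEscapes("\u0B95\u0BBE"));
    }
}
```

For the string from the question this should print \u0b95\u0bbe on one line.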
McDowell

According to this, you can try

System.out.println( "\\u" + Integer.toHexString('க' | 0x10000).substring(1) );

but this only works for characters up to U+FFFF (the Basic Multilingual Plane, which covers everything in Unicode up to 3.0). To convert more than one character, loop over the string, e.g.

String foo = "கா";
for (int i = 0; i < foo.length(); i++)
    System.out.println( "\\u" + Integer.toHexString(foo.charAt(i) | 0x10000).substring(1));

which produces

\u0b95
\u0bbe

If you want them on one line, change System.out.println() to System.out.print() and add System.out.print("\n") at the end.
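The version caveat comes down to characters above U+FFFF, which Java stores as two chars (a surrogate pair). The sketch below, using the hypothetical names SurrogateEscapes and toEscapes, applies the same | 0x10000 trick per char and collects the result on one line. U+1F600 is just an arbitrary example of such a character, chosen here for illustration.

```java
class SurrogateEscapes {
    // Hypothetical helper: one escape per UTF-16 code unit, all on one line.
    static String toEscapes(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            // OR-ing with 0x10000 forces five hex digits; dropping the first
            // digit keeps the leading zeros of the four-digit value.
            String hex = Integer.toHexString(text.charAt(i) | 0x10000).substring(1);
            out.append("\\u").append(hex);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String tamil = "\u0B95\u0BBE";
        // A single code point above U+FFFF, built from its numeric value.
        String supplementary = new String(Character.toChars(0x1F600));

        System.out.println(toEscapes(tamil));
        System.out.println(toEscapes(supplementary));
        // One code point, but two chars in the String.
        System.out.println(supplementary.codePointCount(0, supplementary.length())
                + " code point, " + supplementary.length() + " chars");
    }
}
```

The supplementary character comes out as the two surrogate escapes \ud83d\ude00 rather than a single value, which is still the form Java string literals expect.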

Community
Mateusz