I have this string "\U05d0\U05d5\U05d2\U05e0\U05d3\U05d4","\U05d0\U05d5\U05d6\U05d1\U05e7\U05d9\U05e1\U05d8\U05df","\U05d0\U05d5\U05e1\U05d8\U05e8\U05d9\U05d4"

How do I convert it to a readable string? (Note: this is supposed to be Hebrew.)

I tried this method, but it didn't work:

byte[] bytes = s.getBytes();
String decoded = new String(bytes); 
System.out.println(decoded);
Stefan Falk
Lena Bru
  • Not that straightforward. See http://stackoverflow.com/questions/3537706/howto-unescape-a-java-string-literal-in-java for hints. – nos May 21 '14 at 09:46
  • **Never use `String.getBytes()` or `String(byte[])`.** They are machine-dependent, they use default system encoding and they often lead to data corruption. – Karol S Jul 10 '14 at 20:28
  • @KarolS thanks :) However, since I needed a one-time conversion, this was OK. Is there a more universal approach to this problem, then? – Lena Bru Jul 10 '14 at 21:12

1 Answer

Every `\U` should be a lowercase `\u`; Java string literals only recognise the `\uXXXX` form of Unicode escape:

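    import java.io.UnsupportedEncodingException;

    String s = "\u05d0\u05d5\u05d2\u05e0\u05d3\u05d4";

    try {
        // Name the charset explicitly: the no-argument overloads use the
        // platform default and never throw UnsupportedEncodingException,
        // so the catch block below would not compile with them.
        byte[] bytes   = s.getBytes("UTF-8");
        String decoded = new String(bytes, "UTF-8");

        System.out.println(decoded);
    } catch (UnsupportedEncodingException e) {
        // ...
    }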

See Byte Encodings and Strings.

Output:

אוגנדה
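
If the text arrives at runtime with literal backslash escapes (for example read from a file or a server response) rather than being typed as a Java literal, the compiler never sees the `\u` sequences, so they have to be decoded by hand. The following is a minimal sketch under that assumption; the class name `UnicodeUnescape` and the method `unescape` are hypothetical, and it only handles four-digit `\uXXXX` / `\UXXXX` escapes, not other escape forms.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeUnescape {
    // A literal backslash, then 'u' or 'U', then exactly four hex digits.
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\[uU]([0-9a-fA-F]{4})");

    // Replaces every \uXXXX or \UXXXX sequence with the UTF-16 code unit it names.
    static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against decoded '$' or '\' characters.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Doubled backslashes: the string really contains the characters \U05d0 etc.
        String raw = "\\U05d0\\U05d5\\U05d2\\U05e0\\U05d3\\U05d4";
        System.out.println(unescape(raw));
    }
}
```

Whether the result displays as Hebrew still depends on the console's encoding and font; the decoded `String` itself holds the correct characters either way.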
Stefan Falk
  • This will horribly break on systems where the current character set isn't idempotent. A simple System.out.println(s) should work if the system character set supports Hebrew. – Tassos Bassoukos May 21 '14 at 10:06
  • You are right. Basically you can get the character encoding and call `getBytes(getCharacterEncoding())`; then this shouldn't be a problem anymore. I edited my answer. – Stefan Falk May 21 '14 at 10:27
  • Sorry, but without the getBytes part it does not work. – Lena Bru May 21 '14 at 11:08
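
The encoding concern raised in these comments can be illustrated with a short sketch that names the charset on both sides of the conversion, assuming UTF-8 is the intended encoding. The class name `ExplicitCharset` and the method `roundTrip` are hypothetical; `StandardCharsets` has been part of the JDK since Java 7.

```java
import java.nio.charset.StandardCharsets;

public class ExplicitCharset {
    // Encodes and decodes with the same named charset, so the result
    // does not depend on the platform default encoding.
    static String roundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = "\u05d0\u05d5\u05d2\u05e0\u05d3\u05d4";
        // UTF-8 can represent every Hebrew letter, so nothing is lost.
        System.out.println(roundTrip(s).equals(s)); // prints "true"
    }
}
```

Because the `Charset` overloads do not throw a checked exception, no try/catch is needed. Note that such a round trip only leaves the string unchanged; it does not by itself decode `\U` escapes.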