
I have a string like this in Java: "\xd0\xb5\xd0\xbd\xd0\xb4\xd0\xbf\xd0\xbe\xd0\xb9\xd0\xbd\xd1\x82"

How can I convert it to a human readable equivalent?

Note: this is actually GWT, and the string comes from Python as part of some JSON data. The JSONParser turns it into something unrelated to the original text, so I want to be able to convert the string before parsing it.

The expected result, which I am calling "human readable", is "ендпойнт" (https://mothereff.in/utf-8#%D0%B5%D0%BD%D0%B4%D0%BF%D0%BE%D0%B9%D0%BD%D1%82)

shruti1810
ILAPE
  • What is it encoded as? – Sotirios Delimanolis Jul 09 '15 at 15:29
  • What would the "human readable" form of that string be? – Jon Skeet Jul 09 '15 at 15:30
  • I am a human and I can read the string you have posted. Also, you have not said how it was encoded. – We are Borg Jul 09 '15 at 15:40
  • It should be "ендпойнт": https://mothereff.in/utf-8#%D0%B5%D0%BD%D0%B4%D0%BF%D0%BE%D0%B9%D0%BD%D1%82 – ILAPE Jul 09 '15 at 15:48
  • You link to some site that can decode it (using utf8.js) so the answer to your question seems to be: use utf8.js. That said, `\xHH` is not valid in a JSON string, so you probably have a bug in your Python serializer. Also, maybe try `new String(str.getBytes("ISO-8859-1"), "UTF-8")`. – Thomas Broyer Jul 09 '15 at 16:12
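
The re-decoding trick from the last comment can be sketched as follows. It only applies when the `\xHH` escapes have already been turned into one char per byte (for example `\xd0` became U+00D0), and it assumes a standard JVM; whether the same calls are emulated in a given GWT version is not something this sketch checks. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

final class Latin1Redecode {

    // Reinterprets a string in which every char carries one raw UTF-8 byte.
    // ISO-8859-1 maps chars 0x00-0xFF to the same byte values, so getBytes
    // recovers the original bytes; chars above 0xFF cannot be encoded and
    // are replaced with '?'.
    static String redecode(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The question's bytes, each one already read as a separate char
        String misdecoded = "\u00d0\u00b5\u00d0\u00bd\u00d0\u00b4\u00d0\u00bf"
                + "\u00d0\u00be\u00d0\u00b9\u00d0\u00bd\u00d1\u0082";
        // ендпойнт (given a console that can display Cyrillic)
        System.out.println(redecode(misdecoded));
    }
}
```

If the string still contains a literal backslash followed by `x`, this does nothing useful; the escapes have to be parsed first.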

2 Answers


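Assuming that the string is a sequence of escapes of the form "\xHH", where HH is two hexadecimal digits (0-9, a-f, A-F) and each escape stands for one byte of UTF-8 text, you can convert it with something like this:

String values = "\\xd0\\xb5\\xd0\\xbd\\xd0\\xb4\\xd0\\xbf\\xd0\\xbe\\xd0\\xb9\\xd0\\xbd\\xd1\\x82";
java.io.ByteArrayOutputStream bytes = new java.io.ByteArrayOutputStream();
for (String val : values.split("\\\\x")) {
        // Collect each escape as one raw byte
        if (val.length() > 0) bytes.write(Integer.parseInt(val, 16));
}
// Decode all bytes together: most Cyrillic letters take two bytes in UTF-8
System.err.println(new String(bytes.toByteArray(), java.nio.charset.StandardCharsets.UTF_8));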

Note that the if condition is due to the first delimiter: see How to prevent java.lang.String.split() from creating a leading empty string?
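
Since the escapes arrive inside JSON text rather than on their own, a variant that only touches runs of `\xHH` and leaves everything else alone may be more practical. The following is a minimal sketch of that idea, not a tested GWT solution: the class name `HexEscapeDecoder` and its `decode` method are hypothetical, and it assumes that the escaped bytes are UTF-8 and that a literal backslash followed by `x` never occurs for any other reason.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HexEscapeDecoder {

    // One or more consecutive \xHH escapes, where HH is two hex digits
    private static final Pattern RUN = Pattern.compile("(?:\\\\x[0-9a-fA-F]{2})+");

    static String decode(String input) {
        Matcher m = RUN.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the text between the previous run and this one unchanged
            out.append(input, last, m.start());
            String run = m.group();
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            // Each escape is 4 chars long: '\', 'x' and two hex digits
            for (int i = 0; i < run.length(); i += 4) {
                bytes.write(Integer.parseInt(run.substring(i + 2, i + 4), 16));
            }
            // Decode the whole run at once so multi-byte characters stay intact
            out.append(new String(bytes.toByteArray(), StandardCharsets.UTF_8));
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String json = "{\"name\": \"\\xd0\\xb5\\xd0\\xbd\\xd0\\xb4\\xd0\\xbf"
                + "\\xd0\\xbe\\xd0\\xb9\\xd0\\xbd\\xd1\\x82\"}";
        // {"name": "ендпойнт"} (given a console that can display Cyrillic)
        System.out.println(decode(json));
    }
}
```

One caveat: if a decoded byte turns out to be a double quote or a backslash, the result is no longer valid JSON, so fixing the serializer on the Python side is probably the more robust route.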

Andrea Iacono

I can't tell whether the output looks wrong because of my console's encoding or because the code doesn't work, but you can try this:

import java.nio.charset.StandardCharsets;

// Part of the JDK up to Java 8; removed from the JDK in Java 11
import javax.xml.bind.DatatypeConverter;

public class Utf8Decoder {

    public static void main(String[] args) {
        String escaped = "\\xd0\\xb5\\xd0\\xbd\\xd0\\xb4\\xd0\\xbf\\xd0\\xbe\\xd0\\xb9\\xd0\\xbd\\xd1\\x82";
        // Remove every literal "\x" so that only the hex digits remain
        String hex = escaped.replaceAll("\\\\x", "");
        // Parse the hex digits into raw bytes, then decode those bytes as UTF-8
        byte[] bytes = DatatypeConverter.parseHexBinary(hex);
        String result = new String(bytes, StandardCharsets.UTF_8);
        System.out.print("decoded value:" + result);
    }

}
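
`javax.xml.bind.DatatypeConverter` is no longer part of the JDK from Java 11 onward, so on newer runtimes the hex parsing step may need a replacement. Here is a rough sketch that does the same conversion by hand; the class name `HexToUtf8` and the method `decodeHex` are invented for this example, and the input is assumed to contain only hex digits (the `\x` prefixes already stripped).

```java
import java.nio.charset.StandardCharsets;

final class HexToUtf8 {

    // Converts a string of hex digit pairs to bytes and decodes them as UTF-8.
    static String decodeHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            // Character.digit returns -1 for anything that is not a hex digit
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit near index " + (2 * i));
            }
            bytes[i] = (byte) ((hi << 4) | lo);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String escaped = "\\xd0\\xb5\\xd0\\xbd\\xd0\\xb4\\xd0\\xbf\\xd0\\xbe\\xd0\\xb9\\xd0\\xbd\\xd1\\x82";
        // String.replace treats "\\x" as a literal backslash followed by x
        String hex = escaped.replace("\\x", "");
        // decoded value:ендпойнт (given a console that can display Cyrillic)
        System.out.println("decoded value:" + decodeHex(hex));
    }
}
```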
triForce420