4

I have looked for solutions, but there doesn't seem to be much on this topic. I have found solutions that suggest:

    String unicodeString = new String("utf8 here");
    byte[] bytes = unicodeString.getBytes("UTF8");
    String converted = new String(bytes, "UTF16");

for converting from UTF-8 to UTF-16. However, Java doesn't seem to handle "UTF32", which makes this approach unviable for UTF-32. Does anyone know another way to achieve this?

Daniel Medina Sada

3 Answers

4

After searching, I got this to work:

    public static String convert16to32(String toConvert){
        for (int i = 0; i < toConvert.length(); ) {
            int codePoint = Character.codePointAt(toConvert, i);
            i += Character.charCount(codePoint);
            //System.out.printf("%x%n", codePoint);
            String utf32 = String.format("0x%x%n", codePoint);
            return utf32;
        }
        return null;
    }
Daniel Medina Sada
  • Glad you found a working solution! Sorry for not keeping my promise :P I worked on my code but ran into strange issues that I could not reproduce on other systems. My idea involved using `codePointAt()` as well and it was generally quite similar (just in case you were curious). – rhino Apr 04 '16 at 04:53
  • 2
    `String utf32` should be declared above the loop, and `return utf32;` should come after the loop; otherwise, there is no point in having the loop at all. – Remy Lebeau Mar 22 '18 at 23:44
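
Following the comment above, here is a minimal sketch of how the loop could collect every code point instead of returning after the first one. The class name `Utf16ToUtf32Sketch` and the `List<String>` return type are assumptions made for illustration; they are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this sketch.
class Utf16ToUtf32Sketch {

    // Walks the UTF-16 string one code point at a time and records
    // each code point (its UTF-32 value) as a "0x..." hex string.
    static List<String> convert16to32(String toConvert) {
        List<String> utf32 = new ArrayList<>(); // declared above the loop
        for (int i = 0; i < toConvert.length(); ) {
            int codePoint = toConvert.codePointAt(i);
            // Advance by 1 char for BMP characters, 2 for surrogate pairs.
            i += Character.charCount(codePoint);
            utf32.add(String.format("0x%x", codePoint));
        }
        return utf32; // returned after the loop
    }

    public static void main(String[] args) {
        // "A" followed by U+1F600, which needs a surrogate pair in UTF-16.
        System.out.println(convert16to32("A\uD83D\uDE00")); // prints [0x41, 0x1f600]
    }
}
```

If only the numeric values are needed, `toConvert.codePoints().toArray()` (Java 8 and later) yields the same code points as an `int[]`.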
3

Java does handle UTF-32. Try this test:

    byte[] a = "1".getBytes("UTF-32");
    System.out.println(a.length);

It will show that the array's length is 4.
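
Building on that test, a UTF-8 to UTF-32 conversion could be sketched roughly as below. This assumes the runtime provides the `UTF-32BE` charset (OpenJDK does, but it is not one of the charsets every Java platform is required to support); the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this sketch.
class Utf32RoundTripSketch {

    // Assumption: the JVM ships a UTF-32BE charset. Charset.forName
    // throws UnsupportedCharsetException if it does not.
    static final Charset UTF_32BE = Charset.forName("UTF-32BE");

    // Decodes UTF-8 bytes into a String, then re-encodes the String
    // as big-endian UTF-32 (4 bytes per code point, no byte order mark).
    static byte[] utf8ToUtf32(byte[] utf8) {
        String decoded = new String(utf8, StandardCharsets.UTF_8);
        return decoded.getBytes(UTF_32BE);
    }

    public static void main(String[] args) {
        String text = "A\uD83D\uDE00"; // "A" plus U+1F600
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] utf32 = utf8ToUtf32(utf8);
        System.out.println(utf8.length);  // prints 5
        System.out.println(utf32.length); // prints 8
        System.out.println(new String(utf32, UTF_32BE).equals(text)); // prints true
    }
}
```

The explicit `UTF-32BE` name is used here so the byte order does not depend on how a given runtime treats the plain `UTF-32` name.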

Evgeniy Dorofeev
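1

    // Hex digit lookup table. Not shown in the original post; lowercase is assumed.
    private static final char[] kDigits = "0123456789abcdef".toCharArray();

    // Converts each byte to two hex characters.
    public static char[] bytesToHex(byte[] raw) {
        int length = raw.length;
        char[] hex = new char[length * 2];
        for (int i = 0; i < length; i++) {
            int value = (raw[i] + 256) % 256; // map signed byte to 0..255
            int highIndex = value >> 4;
            int lowIndex = value & 0x0f;
            hex[i * 2 + 0] = kDigits[highIndex];
            hex[i * 2 + 1] = kDigits[lowIndex];
        }
        return hex;
    }

    // `converted` is the String from the question.
    byte[] bytearr = converted.getBytes("UTF-32");
    System.out.println("With UTF-32 encoding:\t" + String.valueOf(bytesToHex(bytearr)));
    System.out.println("With UTF-32 decoding:\t" + new String(bytearr, "UTF-32"));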
pczeus