
I have the following code, which builds an emoji string from its UTF-8 bytes, and it works:

    byte[] emojiBytes = new byte[]{(byte)0xF0,(byte)0x9F,(byte)0x98,(byte)0x81};
    String emojiAsString = new String(emojiBytes,Charset.forName("UTF-8"));
    // JButton button = new JButton("<html>" + emojiAsString + "</html>");
    JButton button = new JButton(emojiAsString);

But what if I only know the Unicode code point as hex text, such as 1F601 or 1F603? I want to convert the symbols listed on this page: https://apps.timwhitlock.info/emoji/tables/unicode

Given a string like 1F601, how do I convert it to \xF0\x9F\x98\x81 then to new byte[]{(byte)0xF0,(byte)0x9F,(byte)0x98,(byte)0x81}?

So to simplify, my code would look like this:

    JButton getButton(String unicodeText)
    {
        JButton aButton= // how to convert ???

        return aButton;
    }

Then I call it like this: JButton myButton=getButton("1F601");

Dale K
Frank
  • so you want to [convert hex string to byte array](https://stackoverflow.com/questions/140131/convert-a-string-representation-of-a-hex-dump-to-a-byte-array-using-java) – Scary Wombat Jul 03 '19 at 00:27

1 Answer


The hex string gives a hex number which is a Unicode code point; that then needs to be converted to UTF-8. The trouble is that the code point exceeds 0xFFFF, which means it's not directly representable as a Java char.

After a little research, here is one quick and dirty test program.

Character.toChars converts the code point to a char array, from which we construct a String; getBytes(StandardCharsets.UTF_8) then encodes that String as UTF-8 bytes.

Even though the String stores the code point internally as two UTF-16 chars (a surrogate pair), the encoder treats the pair as one character and emits a single 4-byte UTF-8 sequence, as the Unicode standard requires.

    import java.nio.charset.StandardCharsets;

    class Z {
        public static void main(String[] args) {
            int cp = 0x1f601;
            // Code point -> UTF-16 chars -> String -> UTF-8 bytes
            byte[] b = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
            for (int k = 0; k < b.length; k++) {
                System.out.printf(" %x ", b[k]);
            }
            System.out.println();
        }
    }

The output is:

    $ java Z
     f0  9f  98  81
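
The point about the code point exceeding 0xFFFF can be checked directly. The following is only a small sketch; the class name SurrogateDemo and the helper utf16Units are made up for illustration and are not part of the answer above. It shows that Character.toChars returns two chars for 0x1F601 rather than one.

```java
class SurrogateDemo {
    // Hypothetical helper: lists the UTF-16 code units of a code point as hex text.
    static String utf16Units(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char c : Character.toChars(codePoint)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int cp = 0x1F601;
        // A supplementary code point lies above 0xFFFF and needs two chars.
        System.out.println(Character.isSupplementaryCodePoint(cp)); // true
        System.out.println(Character.charCount(cp));                // 2
        System.out.println(utf16Units(cp));                         // D83D DE01
    }
}
```

A code point that fits in a single char, such as 0x41, yields just one unit ("0041") from the same helper.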
  • It is implicitly assumed that converting the String "1f601" to an int 1f601 does not need explanation. –  Jul 03 '19 at 02:27
  • Seems it solved half the problem, could you take a string input like "1F601" and generate an output of "new byte[]{(byte)0xF0,(byte)0x9F,(byte)0x98,(byte)0x81}" ? I tried this which didn't work : int cp = Integer.parseInt("1f601") – Frank Jul 03 '19 at 03:28
  • It's hexadecimal -- ```Integer.parseInt("1f601", 16)``` –  Jul 03 '19 at 11:18
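
Putting the answer and the radix hint from the comments together, a getButton along the lines the question asks for might look like the sketch below. The class name EmojiButtons and the helpers emojiFromHex and utf8Bytes are assumptions introduced here for illustration; only getButton comes from the question. Note that the byte array is not actually needed to build the button, since the String can be created from the code point directly.

```java
import java.nio.charset.StandardCharsets;
import javax.swing.JButton;

class EmojiButtons {
    // Hypothetical helper: parses hex text such as "1F601" (radix 16)
    // and returns the corresponding character as a Java String.
    static String emojiFromHex(String hexCodePoint) {
        int cp = Integer.parseInt(hexCodePoint, 16);
        return new String(Character.toChars(cp));
    }

    // Hypothetical helper: the UTF-8 bytes, in case they are wanted explicitly.
    static byte[] utf8Bytes(String hexCodePoint) {
        return emojiFromHex(hexCodePoint).getBytes(StandardCharsets.UTF_8);
    }

    // The method from the question: no manual byte handling required.
    static JButton getButton(String unicodeText) {
        return new JButton(emojiFromHex(unicodeText));
    }

    public static void main(String[] args) {
        for (byte b : utf8Bytes("1F601")) {
            System.out.printf("%02X ", b);
        }
        System.out.println();
    }
}
```

Input that is not valid hex makes Integer.parseInt throw a NumberFormatException, and a value outside the Unicode range makes Character.toChars throw an IllegalArgumentException, so a caller may want to catch both. Whether the emoji is actually drawn on the button still depends on the font in use.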