
I am trying to pull a JSON file from the backend that contains Unicode code points for emoji. These are not the legacy code points (example: \ue415), but rather code points that work cross-platform (example: \U0001F604).

Here is a sample of the JSON being pulled:

[
  {
    "unicode": "U0001F601",
    "meaning": "Argh!"
  },
  {
    "unicode": "U0001F602",
    "meaning": "Laughing so hard"
  }
]

I am having difficulty converting these strings into Unicode characters that will display as emoji within the app.

Any help is greatly appreciated!

Jeremy H

1 Answer


To convert these code point strings into an NSString, you first need the bytes of each code point.

Once you have the bytes, initializing an NSString from them is easy. The code below does that. It assumes jsonArray is the NSArray parsed from the JSON you are pulling.

// jsonArray is assumed to be populated elsewhere via JSON serialization (e.g. NSJSONSerialization)
NSArray *jsonArray;

[jsonArray enumerateObjectsUsingBlock:^(id obj, NSUInteger idx, BOOL *stop) {
    NSString *charCode = obj[@"unicode"];

    // remove prefix 'U'
    charCode = [charCode substringFromIndex:1];

    unsigned unicodeInt = 0;

    // parse the remaining hex digits into an integer code point
    [[NSScanner scannerWithString:charCode] scanHexInt:&unicodeInt];


    // split the code point into four bytes, most significant byte first (big-endian)
    char chars[4];
    int len = 4;

    chars[0] = (unicodeInt >> 24) & (1 << 24) - 1;
    chars[1] = (unicodeInt >> 16) & (1 << 16) - 1;
    chars[2] = (unicodeInt >> 8) & (1 << 8) - 1;
    chars[3] = unicodeInt & (1 << 8) - 1;


    // the bytes in chars are in big-endian order, so name that byte order explicitly
    NSString *unicodeString = [[NSString alloc] initWithBytes:chars
                                                       length:len
                                                     encoding:NSUTF32BigEndianStringEncoding];

    NSLog(@"%@ - %@", obj[@"meaning"], unicodeString);
}];
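
The parsing and byte-splitting steps above can also be sketched in plain C, which compiles unchanged inside an Objective-C file. This is only an illustration under stated assumptions: the function names parse_code_point and code_point_to_utf32be are hypothetical, and returning 0 for bad input is an arbitrary convention chosen for the sketch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Parse a string such as "U0001F601" into a code point.
   Hypothetical helper: returns 0 when the input lacks the 'U' prefix,
   has trailing non-hex characters, or exceeds U+10FFFF. */
uint32_t parse_code_point(const char *s)
{
    if (s == NULL || (s[0] != 'U' && s[0] != 'u') || s[1] == '\0') {
        return 0;
    }
    char *end = NULL;
    unsigned long value = strtoul(s + 1, &end, 16);
    if (*end != '\0' || value > 0x10FFFFul) {
        return 0;
    }
    return (uint32_t)value;
}

/* Write the code point as four bytes, most significant first (UTF-32BE).
   The cast to unsigned char keeps only the low 8 bits of each shifted value. */
void code_point_to_utf32be(uint32_t cp, unsigned char out[4])
{
    out[0] = (unsigned char)(cp >> 24);
    out[1] = (unsigned char)(cp >> 16);
    out[2] = (unsigned char)(cp >> 8);
    out[3] = (unsigned char)cp;
}
```

For "U0001F601" the four bytes come out as 00 01 F6 01, which can then be handed to initWithBytes:length:encoding: with a big-endian UTF-32 encoding.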
ryumer
  • 2
    Thanks! This has helped me make progress over the last few days. Now I have come across another need: converting it back into a string using the U0001F601 format. I have come across many answers that involve the \ue415 format, but not the U0001F601 format. Would you happen to have a solution handy for that approach as well? – Jeremy H Jul 11 '14 at 15:52
  • 1
    None of your &'s are required. When right shifting unsigned numbers, the right shift is a logical (zero fill) shift. And in any event, the assignment to the 8 bit char type would truncate the value anyway. And even if they were required, all of them should be “0xFF” (or if you really prefer, “((1 << 8)-1)”, including the first two, since you only ever want the resulting bottom 8 bits. – Peter N Lewis Aug 24 '18 at 04:05
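
For the follow-up question about going back to the U0001F601 format, one possible approach is sketched below in plain C. It relies on the fact that NSString stores UTF-16, so an emoji outside the Basic Multilingual Plane shows up as a surrogate pair (for example via characterAtIndex:). The function names are hypothetical, and the sketch assumes the caller has already checked that the two units form a valid high/low surrogate pair.

```c
#include <stdint.h>
#include <stdio.h>

/* Combine a UTF-16 surrogate pair into a single code point.
   Assumes high is in 0xD800..0xDBFF and low is in 0xDC00..0xDFFF. */
uint32_t code_point_from_surrogates(uint16_t high, uint16_t low)
{
    return 0x10000u
         + (((uint32_t)high - 0xD800u) << 10)
         + ((uint32_t)low - 0xDC00u);
}

/* Format a code point as 'U' followed by 8 uppercase hex digits.
   out must have room for at least 10 bytes including the terminator.
   Returns the value reported by snprintf. */
int format_code_point(uint32_t cp, char *out, size_t out_size)
{
    return snprintf(out, out_size, "U%08lX", (unsigned long)cp);
}
```

With the pair 0xD83D, 0xDE01 this produces the string U0001F601. Characters inside the Basic Multilingual Plane are a single UTF-16 unit and can be passed to format_code_point directly.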