
I have an issue receiving a string from a PHP backend into my iOS app. The string I receive looks like this:

Test ððððð

Those special characters should be smileys. I checked with this encoder: https://encoder.mattiasgeniar.be/index.php and the string is indeed the UTF-8 encoded form of the one with smileys:

Test

Now I wonder: what encoding is the source string in? And how can I convert it to a UTF-8 string that displays correctly on iOS?

I've tried

NSData *decodedData = [[NSData alloc] initWithBase64EncodedString:@"Test ððððð" options:0];
NSString *message = [[NSString alloc] initWithData:decodedData encoding:NSUTF8StringEncoding];

and

NSString *message = (__bridge_transfer NSString *)CFURLCreateStringByReplacingPercentEscapesUsingEncoding(NULL, (CFStringRef)@"Test ððððð", CFSTR(""), kCFStringEncodingUTF8);

and also

NSString *message = [@"Test ððððð" stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];

but none of those worked. I'm kind of baffled what the source string is encoded like.

jscs
Martin Schultz

1 Answer


There's probably nothing wrong with your Foundation app (which, by the way, natively supports UTF-8 & UTF-16 very, very well).


To answer your last question:

I'm kind of baffled what the source string is encoded like.

If you crack open that string and take a look at it in terms of bytes, you'll notice that the eth character ('ð' [Icelandic and Faroese use this character]) is Unicode code point U+00F0, which is the single byte 0xf0 in Latin-1 (ISO-8859-1).

0xf0 is also the lead byte of a four-byte UTF-8 sequence, which is how the emoji '😀' above is encoded (0xf0, 0x9f, 0x98, 0x80). The other 3 bytes of each emoji are lost, or at least not shown: read as Latin-1, the bytes 0x9f, 0x98 and 0x80 are invisible C1 control characters.
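To check the byte values mentioned above on the device, here is a minimal sketch using Foundation. `HexBytes` and `MisreadUTF8AsLatin1` are hypothetical helper names, not part of any API, and the second one only simulates the suspected bug (UTF-8 bytes decoded as Latin-1). It is not confirmed that this is what the backend actually does.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: encode `string` with `encoding` and return the bytes
// as lowercase hex, e.g. "f0 9f 98 80". Returns nil if the string cannot be
// represented in that encoding without loss.
static NSString *HexBytes(NSString *string, NSStringEncoding encoding) {
    NSData *data = [string dataUsingEncoding:encoding];
    if (data == nil) {
        return nil;
    }
    NSMutableArray<NSString *> *parts =
        [NSMutableArray arrayWithCapacity:data.length];
    const uint8_t *bytes = data.bytes;
    for (NSUInteger i = 0; i < data.length; i++) {
        [parts addObject:[NSString stringWithFormat:@"%02x", bytes[i]]];
    }
    return [parts componentsJoinedByString:@" "];
}

// Hypothetical helper: simulate the suspected bug by taking the UTF-8 bytes
// of `string` and decoding them as Latin-1, one character per byte.
static NSString *MisreadUTF8AsLatin1(NSString *string) {
    NSData *utf8 = [string dataUsingEncoding:NSUTF8StringEncoding];
    return [[NSString alloc] initWithData:utf8
                                 encoding:NSISOLatin1StringEncoding];
}
```

With these, `HexBytes(@"\U0001F600", NSUTF8StringEncoding)` gives `f0 9f 98 80`, `HexBytes(@"\u00F0", NSUTF8StringEncoding)` gives `c3 b0`, and `HexBytes(@"\u00F0", NSISOLatin1StringEncoding)` gives `f0`. `MisreadUTF8AsLatin1(@"\U0001F600")` is a 4-character string whose first character is 'ð', which matches what the app displays.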

[TL;DR]

Something in your backend, maybe PHP itself, isn't supporting Unicode very well.
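The real fix belongs on the server side, but if the trailing bytes did survive as invisible characters, a client-side workaround may be possible. The following is a sketch under that assumption only: `RepairLatin1Mojibake` is a hypothetical helper name, and it returns nil whenever the string cannot be repaired this way (for example, if bytes were truly dropped).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: undo a "UTF-8 bytes decoded as Latin-1" mix-up.
// Returns nil if the input does not fit that pattern.
static NSString *RepairLatin1Mojibake(NSString *garbled) {
    // Map each character back to the single byte it came from. This yields
    // nil if the string contains characters outside Latin-1, in which case
    // it was not produced by a Latin-1 misread.
    NSData *raw = [garbled dataUsingEncoding:NSISOLatin1StringEncoding];
    if (raw == nil) {
        return nil;
    }
    // Reinterpret those bytes as UTF-8. This yields nil if the bytes are not
    // valid UTF-8, e.g. when only the 0xf0 lead byte of an emoji is left.
    return [[NSString alloc] initWithData:raw encoding:NSUTF8StringEncoding];
}
```

A caller would fall back to the original string when the result is nil, e.g. `NSString *message = RepairLatin1Mojibake(received) ?: received;`. Note that a string consisting of a lone 'ð' per emoji, with nothing after it, cannot be repaired this way because the remaining three bytes are gone.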

Community
Sean
  • Actually, it was "broken content" in the database. I tried displaying only the last message, which was entered via the phone, and that showed the smileys. Once I added all the other messages, it broke. It seems there was some weird data in the table. – Martin Schultz Sep 27 '16 at 08:26