
Possible Duplicate:
unicode escapes in objective-c

I have a LATIN1 string.

Artîsté

When I json_encode it, the non-ASCII characters are escaped and the result is a single-byte (ASCII-only) string:

Art\u00eest\u00e9

If I just json_decode it, I believe it is decoded as UTF-8:

Artîsté

To get my original string back, I have to call utf8_decode:

Artîsté

Is there a way to handle this conversion in Objective-C?

joels
  • What exactly are json_encode, json_decode and utf8_decode? Functions/methods that you’ve implemented? Some library you’re using? – Oct 25 '11 at 21:43
  • Double-post of [unicode escapes in objective-c](http://stackoverflow.com/questions/7893879/unicode-escapes-in-objective-c) – jscs Oct 25 '11 at 22:00

2 Answers


You might be looking for this:

NSString *string = @"Artîsté"; // any string with non-ASCII characters in it
// NUL-terminated Latin-1 bytes; NULL if the string cannot be represented in Latin-1
char const *string_as_latin1 = [string cStringUsingEncoding:NSISOLatin1StringEncoding];

or possibly this:

NSData *data_latin1 = [string dataUsingEncoding:NSISOLatin1StringEncoding allowLossyConversion:YES];
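
To connect this to the JSON round trip in the question, here is a minimal sketch, assuming the JSON text arrives as a one-element array and that NSJSONSerialization (iOS 5 / OS X 10.7 and later) is available. The helper name FirstStringInJSONArray is made up for illustration and is not part of any library.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the question): parses a JSON array and
// returns its first element if that element is a string, otherwise nil.
static NSString *FirstStringInJSONArray(NSString *jsonText) {
    NSData *jsonData = [jsonText dataUsingEncoding:NSUTF8StringEncoding];
    NSError *error = nil;
    id parsed = [NSJSONSerialization JSONObjectWithData:jsonData
                                                options:0
                                                  error:&error];
    if (![parsed isKindOfClass:[NSArray class]]) {
        return nil;
    }
    NSArray *array = (NSArray *)parsed;
    if ([array count] == 0) {
        return nil;
    }
    id first = [array objectAtIndex:0];
    return [first isKindOfClass:[NSString class]] ? (NSString *)first : nil;
}

int main(void) {
    @autoreleasepool {
        // The JSON text itself is pure ASCII. The backslashes are doubled
        // only because this is an Objective-C string literal.
        NSString *name = FirstStringInJSONArray(@"[\"Art\\u00eest\\u00e9\"]");

        // An NSString has no "current encoding"; an encoding is chosen only
        // when the string is converted to bytes.
        NSData *latin1 = [name dataUsingEncoding:NSISOLatin1StringEncoding];
        NSData *utf8 = [name dataUsingEncoding:NSUTF8StringEncoding];
        NSLog(@"%@: %lu Latin-1 bytes, %lu UTF-8 bytes", name,
              (unsigned long)[latin1 length], (unsigned long)[utf8 length]);
    }
    return 0;
}
```

The JSON parser resolves the \u escapes into characters, so no separate "utf8_decode" step is needed on the Objective-C side: the same string yields 7 bytes as Latin-1 and 9 bytes as UTF-8, depending only on which encoding is requested.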
rob mayoff
  • Your solution would have worked if my code hadn't been wrong. Everything I tried was operating on characters that were already mangled. The encoder should have returned Art\u00eest\u00e9, but something behind the scenes was changing my input string, so the result was Art\u00c3\u00aest\u00c3\u00a9 and I was trying to decode the wrong string. – joels Oct 25 '11 at 22:21

I have a LATIN1 string.

I don't think you do. Assuming you are talking about PHP, json_encode() only accepts UTF-8 strings, and bails out if it hits a non-UTF-8 high-byte sequence:

json_encode("Art\xeest\xe9")
"Art"
json_encode("Art\xc3\xaest\xc3\xa9")
"Art\u00eest\u00e9"

I think you had a proper UTF-8 string to start with, then you encoded and decoded it to get the exact same UTF-8 string back. But then you're displaying it or processing it in another step you haven't shown us, that treats your string as if it were Latin-1.
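
The kind of mix-up described above can be reproduced in Objective-C. This is a rough sketch, assuming ARC; the helper names MisreadUTF8AsLatin1 and RepairMisreadString are invented for illustration. It shows how reading UTF-8 bytes as Latin-1 produces the garbled form, and how reversing that single step recovers the text.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: reproduces the bug by taking the UTF-8 bytes of a
// string and decoding them as if they were Latin-1.
static NSString *MisreadUTF8AsLatin1(NSString *s) {
    NSData *utf8 = [s dataUsingEncoding:NSUTF8StringEncoding];
    return [[NSString alloc] initWithData:utf8
                                 encoding:NSISOLatin1StringEncoding];
}

// Hypothetical helper: undoes the misreading. Returns nil if the input
// cannot be represented in Latin-1 or the resulting bytes are not valid UTF-8.
static NSString *RepairMisreadString(NSString *s) {
    NSData *bytes = [s dataUsingEncoding:NSISOLatin1StringEncoding];
    if (bytes == nil) {
        return nil;
    }
    return [[NSString alloc] initWithData:bytes
                                 encoding:NSUTF8StringEncoding];
}

int main(void) {
    @autoreleasepool {
        NSString *original = @"Art\u00eest\u00e9";                 // Artîsté
        NSString *garbled = MisreadUTF8AsLatin1(original);         // ArtÃ®stÃ©
        NSString *repaired = RepairMisreadString(garbled);         // Artîsté
        NSLog(@"%@ -> %@ -> %@", original, garbled, repaired);
    }
    return 0;
}
```

The garbled string contains the characters U+00C3 U+00AE and U+00C3 U+00A9, which is exactly what shows up as \u00c3\u00ae and \u00c3\u00a9 if it is JSON-encoded again. Repairing after the fact is a workaround; removing the extra conversion step is the cleaner fix.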

bobince
  • I think you are right there. There was a section of code that converted it to utf8 and I forgot it was there... – joels Oct 25 '11 at 22:18