I'm trying to detect the character â in a string in Objective-C and can't seem to get it to work. It shows up as a bullet point when it's finally rendered on screen, so maybe that's why I can't detect it?

In iOS 10 these bullet points display larger than they should, so I need to find the range of each of these characters and make them a few sizes smaller. I've tried the following:

[inputString contains:@"â"]
[inputString contains:@"•"]
[inputString contains:@"\u00b7"]
[inputString contains:@"\u2022"]

The one that interests me the most is when I copy and paste exactly from the API response: [inputString contains:@"â "]. There's actually 4 or 5 spaces in that string, but they get truncated when pasting from the JSON I get back -- I'm not sure why but I feel like that has to do with why I can't recognize the string contains that character.

Any ideas on how to correctly deal with this character?

Edit: Few more details, here's the string that gets sent back from API:

â All of your exclusive deals in one place\nâ More deals matched specifically to you\nâ Get alerts to know when new deals are available or your saved deals are expiring"

I noticed something weird as well, when I edit the response and add in more of those a's with a hat, they get moved into bullet points, however when I add them into the string in code, they are displayed as simply bullet points. Maybe they're being encoded somehow? Although I don't see anywhere in our code where that could be happening, so I'm a little confused as to what's going on here.

Edit 2: Here's a hexdump of the line, this is probably more useful to some of you than it is to me:

000026c0  6e 74 65 6e 74 22 3a 20  22 e2 97 8f 20 41 6c 6c  |ntent": "... All|
000026d0  20 6f 66 20 79 6f 75 72  20 65 78 63 6c 75 73 69  | of your exclusi|
000026e0  76 65 20 64 65 61 6c 73  20 69 6e 20 6f 6e 65 20  |ve deals in one |
000026f0  70 6c 61 63 65 5c 6e e2  97 8f 20 4d 6f 72 65 20  |place\n... More |
00002700  64 65 61 6c 73 20 6d 61  74 63 68 65 64 20 73 70  |deals matched sp|
00002710  65 63 69 66 69 63 61 6c  6c 79 20 74 6f 20 79 6f  |ecifically to yo|
00002720  75 5c 6e e2 97 8f 20 47  65 74 20 61 6c 65 72 74  |u\n... Get alert|
00002730  73 20 74 6f 20 6b 6e 6f  77 20 77 68 65 6e 20 6e  |s to know when n|
00002740  65 77 20 64 65 61 6c 73  20 61 72 65 20 61 76 61  |ew deals are ava|
00002750  69 6c 61 62 6c 65 20 6f  72 20 79 6f 75 72 20 73  |ilable or your s|
00002760  61 76 65 64 20 64 65 61  6c 73 20 61 72 65 20 65  |aved deals are e|
00002770  78 70 69 72 69 6e 67 22  2c 0d 0a 20 20 20 20 22  |xpiring",..    "|
Bill L
  • Can you show your response string? – Nirav D Aug 23 '16 at 15:40
  • Have you looked into normalizing the strings? https://www.objc.io/issues/9-strings/unicode/#normalization-forms – Mats Aug 23 '16 at 15:43
  • I tried logging out the four forms of normalized strings and they all still became bullet points in my console – Bill L Aug 23 '16 at 15:53
  • You're nearly certainly having UTF-8 encoding issues, though it's difficult from the information provided to know where (it could even be server-side). Can you run the output of your API call through hex dump to see what exactly you're receiving? Something like `curl address | hexdump -C` and isolate the relevant bits. Also, can you show the code you use to fetch the data and convert it? – jcaron Aug 23 '16 at 16:20
  • The code I'm using to fetch and convert it is all RestKit. RestKit is hitting the endpoint and simply plugging things into the correct properties. I'm not doing anything special other than telling RestKit where to put each property. I'm pasting in the hexdump of the string I got back now. I'm not sure how to read it very well, but it seems to be displaying that "a" as an ellipsis.... – Bill L Aug 23 '16 at 16:36

2 Answers

> I'm trying to detect the character â in a string

There is no "â" in your text, so there is nothing to detect. The bytes e2 97 8f are the UTF-8 encoding of a bullet character, "●" (U+25CF). Your problem is that the encoding is not being set correctly, so those UTF-8 bytes are being decoded as something else.
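Once the text is decoded as UTF-8, the character to search for is U+25CF rather than "â", U+00B7 or U+2022. A minimal sketch of finding every occurrence, assuming the string has already been decoded correctly; `BulletRanges` is a hypothetical helper name, and it uses the standard `rangeOfString:options:range:` rather than the `contains:` method from the question (presumably a custom category):

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the question): collects the range of every
// U+25CF BLACK CIRCLE in `text`, wrapped in NSValue objects.
static NSArray<NSValue *> *BulletRanges(NSString *text) {
    NSString *bullet = @"\u25CF";  // the character encoded by bytes e2 97 8f
    NSMutableArray<NSValue *> *ranges = [NSMutableArray array];
    NSRange remaining = NSMakeRange(0, text.length);

    while (remaining.length > 0) {
        NSRange found = [text rangeOfString:bullet
                                    options:NSLiteralSearch
                                      range:remaining];
        if (found.location == NSNotFound) {
            break;
        }
        [ranges addObject:[NSValue valueWithRange:found]];

        // Continue searching after the match just recorded.
        NSUInteger next = NSMaxRange(found);
        remaining = NSMakeRange(next, text.length - next);
    }
    return ranges;
}
```

Each range can then be passed to `-[NSMutableAttributedString addAttribute:value:range:]` with `NSFontAttributeName` to shrink just the bullets.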

matt

The bytes e2 97 8f in your dump are the UTF-8 encoding of U+25CF, BLACK CIRCLE. When interpreted as Windows-1252, e2 is â (a with circumflex), 97 is an em dash, and 8f is unassigned. In ISO-8859-1, e2 is also â, while 97 and 8f are non-printing C1 control characters.

This indicates the JSON itself is UTF-8 and is being interpreted differently somewhere, probably as one of the above encodings. Check both your code and the full server response, including the charset it declares (for an example of the latter causing an issue, see the question JSON character encoding).
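To see the effect in isolation, the three bytes from the hexdump can be decoded under two encodings. This is only an illustrative sketch: `DecodeBulletBytes` and `LogDecodings` are hypothetical names, and it says nothing about where in RestKit or the server the wrong encoding is actually applied.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decodes the three bytes from the hexdump (e2 97 8f)
// with the given encoding. Returns nil if the bytes are invalid for it.
static NSString *DecodeBulletBytes(NSStringEncoding encoding) {
    const unsigned char bytes[] = {0xE2, 0x97, 0x8F};
    NSData *data = [NSData dataWithBytes:bytes length:sizeof(bytes)];
    return [[NSString alloc] initWithData:data encoding:encoding];
}

// Logs how the same bytes come out under UTF-8 and ISO-8859-1.
static void LogDecodings(void) {
    NSString *asUTF8 = DecodeBulletBytes(NSUTF8StringEncoding);
    NSString *asLatin1 = DecodeBulletBytes(NSISOLatin1StringEncoding);

    NSLog(@"UTF-8: %@ (%lu UTF-16 unit(s))",
          asUTF8, (unsigned long)asUTF8.length);
    NSLog(@"Latin-1: first unit U+%04X, %lu unit(s)",
          [asLatin1 characterAtIndex:0], (unsigned long)asLatin1.length);
}
```

Decoded as UTF-8 the bytes form the single character U+25CF. Decoded as ISO-8859-1 the first byte becomes U+00E2 (â) and the other two become non-printing characters, which would explain why the "spaces" after â disappear when copied.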

CRD
  • That did it! Your explanation makes perfect sense as to what I was seeing as well, thank you for the help and the clear explanation! – Bill L Aug 23 '16 at 18:00