0

I'm reading JSON from a URL. It is UTF-8 encoded. When I load the UITableView, it shows incorrect characters; see row 2 of the attached screenshot.

The code that reads the data is the following:

NSURL *myURL=[NSURL URLWithString:@"http://www.bancariromani.it/cecadm/newClass/modules/rh/index.php?id_cup=15&json=1"];

NSError *error;
NSData *myData=[[NSData alloc]initWithContentsOfURL:myURL];
if(!myData){

    return;

}

NSArray *jasonArray=[NSJSONSerialization JSONObjectWithData:myData options:kNilOptions error:&error];

I've also tried this without luck:

NSURL *myURL=[NSURL URLWithString:@"http://www.bancariromani.it/cecadm/newClass/modules/rh/index.php?id_cup=15&json=1"];

NSError *error;
NSString *string = [NSString stringWithContentsOfURL:myURL encoding:NSISOLatin1StringEncoding error:nil];

NSData *myData = [string dataUsingEncoding:NSUTF8StringEncoding];

if(!myData){

    return;

}
NSArray *jasonArray=[NSJSONSerialization JSONObjectWithData:myData options:kNilOptions error:&error];

Where am I losing the UTF-8 encoding?

Thanks for helping me

Dario

  • Suggestion: Use [AFNetworking](https://github.com/AFNetworking/AFNetworking) for URL calls – Ali Riahipour May 17 '15 at 16:57
  • Checking what that URL returns, it returns perfectly fine JSON not containing any URL-encoded characters. It looks very much like you are adding them yourself at some point between parsing the JSON data and putting the text into your table view. – gnasher729 May 17 '15 at 17:32
  • Checking further, you didn't tell us that the JSON data contains URLs and that you are downloading _those_ URLs, which don't contain any JSON whatsoever. So the problem that you have has nothing at all to do with JSON, and nothing at all to do with UTF-8. – gnasher729 May 17 '15 at 17:35

3 Answers

2

Your data stores special characters the HTML way, as numeric character references such as &#039;. This is independent of UTF-8: it is a way to represent special characters by their code points, using only ASCII characters.

See http://www.w3.org/TR/html4/charset.html#h-5.3 for how they work. A way to decode them is given in the answers to "HTML character decoding in Objective-C / Cocoa Touch".

Mats
0

Do you mean the "&#039;" part in the second row? That's HTML, and you can convert it by doing URL decoding. You could try this method:

- (NSString *)stringByReplacingPercentEscapesUsingEncoding:(NSStringEncoding)encoding
JOM
  • `stringByReplacingPercentEscapesUsingEncoding:` is for replacing things like `%20` with a space. It doesn't work for HTML entities. – rmaddy May 17 '15 at 17:13
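
To illustrate the point made in this comment, here is a minimal sketch showing that percent-decoding leaves HTML entities untouched. The function name `PercentDecode` and the sample string in the comment are made up for illustration; `stringByRemovingPercentEncoding` is the current replacement for the deprecated `stringByReplacingPercentEscapesUsingEncoding:`.

```objc
#import <Foundation/Foundation.h>

// Hypothetical wrapper, only here to make the behaviour easy to try out.
// Percent escapes such as %20 are decoded; anything else, including HTML
// entities such as &#039;, is passed through unchanged.
static NSString *PercentDecode(NSString *input)
{
    // Returns nil if the input contains an invalid percent sequence.
    return [input stringByRemovingPercentEncoding];
}

// Example (made-up input): PercentDecode(@"L%20&#039;Aquila")
// gives @"L &#039;Aquila": the space is decoded, the entity is not.
```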
0

That &#039; is the HTML escape of a character (here, an apostrophe); that's not related to UTF-8 at all.

Either ask your web service to stop escaping characters as HTML entities, as there is generally no need for them to do that… or you can decode them yourself, like with this code:

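NSMutableString* yourString = [… mutableCopy];
// Passing true for "reverse" converts XML hex references back into characters (the __bridge cast is required under ARC)
CFStringTransform((__bridge CFMutableStringRef)yourString, NULL, kCFStringTransformToXMLHex, true);
NSLog(@"transformed string: %@", yourString);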

Unfortunately, this only seems to work for HTML entities expressed as hexadecimal code points, like &#x27;, and not the ones expressed as decimal code points, like &#039;.

So here is a custom method to do just that (decoding decimal HTML-entities):

NSString* decodeHTMLEntities(NSString* string)
{
    // Matches decimal entities such as &#039; and captures the digits
    NSRegularExpression* decimalEntity = [NSRegularExpression regularExpressionWithPattern:@"&#(\\d+);" options:0 error:nil];
    NSMutableString* resultString = [string mutableCopy];
    // Tracks how far later match ranges have shifted because of earlier replacements
    NSInteger __block offset = 0;
    [decimalEntity enumerateMatchesInString:string options:0 range:NSMakeRange(0,string.length)
                                 usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop)
     {
         unsigned decimalCode = [string substringWithRange:[result rangeAtIndex:1]].intValue;
         // Note: %C takes a single UTF-16 unit, so code points above 0xFFFF are truncated here
         NSString* decodedChar = [NSString stringWithFormat:@"%C", (unichar)decimalCode];
         // Translate the match range from the original string to the partially rewritten one
         result = [result resultByAdjustingRangesWithOffset:offset];
         [resultString replaceCharactersInRange:result.range withString:decodedChar];
         offset += (NSInteger)decodedChar.length - (NSInteger)result.range.length;
     }];
    return [resultString copy];
}

(Of course, it would be much better to ask your web service provider to fix this at the source, as they have no valid reason to escape these characters in the first place.)
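
If the data may contain both decimal and hexadecimal references, or code points above U+FFFF (emoji, for instance), a variant along the following lines might help. This is only a sketch under stated assumptions: `DecodeNumericEntities` is a hypothetical helper, not an Apple API, and named entities such as &amp; are deliberately left alone. It replaces matches from the end of the string backwards, which avoids the offset bookkeeping.

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>

// Hypothetical helper: decodes decimal (&#039;) and hexadecimal (&#x27;)
// numeric character references. Named entities (&amp; etc.) are not handled.
static NSString *DecodeNumericEntities(NSString *input)
{
    // Group 1 captures hex digits, group 2 captures decimal digits.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"&#(?:[xX]([0-9A-Fa-f]{1,6})|([0-9]{1,7}));"
                                                  options:0
                                                    error:NULL];
    NSMutableString *output = [input mutableCopy];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input options:0 range:NSMakeRange(0, input.length)];

    // Replace from the end so that the ranges of earlier matches stay valid.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        NSRange hexRange = [match rangeAtIndex:1];
        NSRange decRange = [match rangeAtIndex:2];
        BOOL isHex = (hexRange.location != NSNotFound);
        NSString *digits = [input substringWithRange:(isHex ? hexRange : decRange)];
        unsigned long code = strtoul(digits.UTF8String, NULL, isHex ? 16 : 10);

        // Leave the reference as-is if it is not a valid Unicode scalar value.
        if (code == 0 || code > 0x10FFFF || (code >= 0xD800 && code <= 0xDFFF)) {
            continue;
        }

        unichar units[2];
        NSUInteger count;
        if (code <= 0xFFFF) {
            units[0] = (unichar)code;
            count = 1;
        } else {
            // Code points above U+FFFF need a UTF-16 surrogate pair.
            unsigned long v = code - 0x10000;
            units[0] = (unichar)(0xD800 + (v >> 10));
            units[1] = (unichar)(0xDC00 + (v & 0x3FF));
            count = 2;
        }
        NSString *replacement = [NSString stringWithCharacters:units length:count];
        [output replaceCharactersInRange:match.range withString:replacement];
    }
    return [output copy];
}
```

You would call it on each string taken from the parsed JSON before handing it to the table view cell, for example on the value that ends up in the cell's text label.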

AliSoftware