How do I split a string with special characters into an NSMutableArray?

Question

I'm trying to separate a string with Danish characters into an NSMutableArray, but something is not working. :(

My code:

NSString *danishString = @"æøå";

NSMutableArray *characters = [[NSMutableArray alloc] initWithCapacity:[danishString length]]; 

for (int i=0; i < [danishString length]; i++) 
{ 
     NSString *ichar = [NSString stringWithFormat:@"%c", [danishString characterAtIndex:i ]]; 
     [characters addObject:ichar]; 
}

If I NSLog danishString itself, it works (prints æøå).

But if I NSLog the characters array, I get some very strange characters. What is wrong?

/Morten

It's too bad whoever down voted everything in this thread, because IMO this is a good question. — Jason Coco, Jan 05 '12 at 09:52

score 2 · Accepted Answer · answered Jan 05 '12 at 12:02

First of all, your code is incorrect. characterAtIndex: returns a unichar, so you should use @"%C" (uppercase C) as the format specifier.

Even with the correct format specifier, your code is unsafe and, strictly speaking, still incorrect, because not every Unicode character can be represented by a single unichar. You should always handle Unicode strings substring by substring. As Apple's documentation puts it:

It's common to think of a string as a sequence of characters, but when working with NSString objects, or with Unicode strings in general, in most cases it is better to deal with substrings rather than with individual characters. The reason for this is that what the user perceives as a character in text may in many cases be represented by multiple characters in the string.
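
To make that concrete, here is a minimal sketch that counts composed character sequences, so the result can be compared with length, which counts UTF-16 code units. The helper name ComposedCharacterCount is made up for illustration and is not part of Foundation:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): counts user-perceived
// characters, i.e. composed character sequences, in a string.
static NSUInteger ComposedCharacterCount(NSString *string) {
    __block NSUInteger count = 0;
    [string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                               options:NSStringEnumerationByComposedCharacterSequences
                            usingBlock:^(NSString *substring, NSRange substringRange,
                                         NSRange enclosingRange, BOOL *stop) {
        // Each callback corresponds to one composed character sequence.
        count++;
    }];
    return count;
}
```

For example, @"e\u0301" (an e followed by a combining acute accent) has a length of 2, while the helper returns 1. For @"æøå" both values should be 3, assuming the literal is stored with precomposed characters.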

You should definitely read the String Programming Guide.

Finally, the correct code for you:

NSString *danishString = @"æøå";
NSMutableArray *characters = [[NSMutableArray alloc] initWithCapacity:[danishString length]]; 
[danishString enumerateSubstringsInRange:NSMakeRange(0, danishString.length) options:NSStringEnumerationByComposedCharacterSequences usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    [characters addObject:substring];
}];

If NSLog(@"%@", characters); prints "strange characters" of the form "\Uxxxx", that is correct. It is the default behavior of NSArray's description method, which is what NSLog uses to turn the array into text. To see the "normal" characters, print the elements one by one:

for (NSString *c in characters) {
    NSLog(@"%@", c);
}

score 0 · Answer 2 · answered Jan 05 '12 at 09:45

In your example, characterAtIndex: returns a unichar, not an NSString. If you want NSString objects, take a substring instead:

NSString *danishString = @"æøå";
NSMutableArray *characters = [[NSMutableArray alloc] initWithCapacity:[danishString length]]; 

for (int i=0; i < [danishString length]; i++) 
{ 
    NSRange r = NSMakeRange(i, 1);
    NSString *ichar = [danishString substringWithRange:r]; 
    [characters addObject:ichar]; 
}

score 0 · Answer 3 · answered Jan 05 '12 at 09:47

You could do something like the following, which should be fine with Danish characters, but would break down if you have decomposed characters. I suggest reading the String Programming Guide for more information.

NSString *danishString = @"æøå";
NSMutableArray* characters = [NSMutableArray array];
for( int i = 0; i < [danishString length]; i++ ) {
  NSString* subchar = [danishString substringWithRange:NSMakeRange(i, 1)];
  if( subchar ) [characters addObject:subchar];
}

That would split the string into an array of individual characters, assuming that every character in the string is precomposed, i.e. stored as a single unichar.
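
If decomposed characters (or characters stored as surrogate pairs) are a concern, one possible variation of the loop above asks the string for the full composed character sequence at each index instead of assuming a length of 1. This is only a sketch: the function name SplitByComposedCharacters is hypothetical, while rangeOfComposedCharacterSequenceAtIndex: is an existing NSString method:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): splits a string into
// user-perceived characters without assuming one unichar per character.
static NSArray *SplitByComposedCharacters(NSString *string) {
    NSMutableArray *parts = [NSMutableArray array];
    NSUInteger index = 0;
    while (index < string.length) {
        // Range covering the whole composed sequence that contains `index`.
        NSRange range = [string rangeOfComposedCharacterSequenceAtIndex:index];
        [parts addObject:[string substringWithRange:range]];
        // Continue right after this sequence.
        index = NSMaxRange(range);
    }
    return parts;
}
```

With this version, a base letter followed by a combining accent ends up in a single array element rather than being split in two.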

score -1 · Answer 4 · answered Jan 05 '12 at 09:44

It is printing the Unicode escape sequences of the characters. In any case, you can use the Unicode escapes (with \u) anywhere.

Ilanchezhian
