2

We're trying to use an NSFetchedResultsController to return people's names and populate a UITableView in sorted order, using localizedCompare:. We're also trying to provide a section index in the UI (the column along the right edge showing the first character of each section). We provide the NSFetchedResultsController with a selector on our entity which returns the section each entity should belong to (specifically, the first character of the person's name, capitalized).

When dealing with people's names that contain non-ASCII Unicode characters, we've run into an issue: NSFetchedResultsController complains that the entities are not sorted by section.

Specifically:

reason=The fetched object at index 103 has an out of order section name 'Ø. Objects must be sorted by section name'}, {
reason = "The fetched object at index 103 has an out of order section name '\U00d8. Objects must be sorted by section name'";

The issue appears to be that the comparison value returned by localizedCompare: is different for the whole "word" versus the leading character.

The following tests pass, though I would expect consistent comparison results between ("Ø" and "O") and ("Østerhus" and "Osypowicz").

- (void)testLocalizedSortOrder300
{
    NSString *str1 = @"Osowski";
    NSString *str2 = @"Østerhus";
    NSString *str3 = @"Osypowicz";

    NSString *letter1 = @"O";
    NSString *letter2 = @"Ø";

    //localizedCompare:

    //"Osowski" < "Østerhus"
    NSComparisonResult res = [str1 localizedCompare:str2];
    XCTAssertTrue(res == NSOrderedAscending, @"(localizedCompare:) Expected '%@' and '%@' to be NSOrderedAscending, but got %@", str1, str2, res == NSOrderedSame ? @"NSOrderedSame" : @"NSOrderedDescending");

    //"Østerhus" < "Osypowicz"
    res = [str2 localizedCompare:str3];
    XCTAssertTrue(res == NSOrderedAscending, @"(localizedCompare:) Expected '%@' and '%@' to be NSOrderedAscending, but got %@", str2, str3, res == NSOrderedSame ? @"NSOrderedSame" : @"NSOrderedDescending");

    //"O" < "Ø"
    res = [letter1 localizedCompare:letter2];
    XCTAssertTrue(res == NSOrderedAscending, @"(localizedCompare:) Expected '%@' and '%@' to be NSOrderedAscending, but got %@", letter1, letter2, res == NSOrderedSame ? @"NSOrderedSame" : @"NSOrderedDescending");
}

So, the question ultimately is: given a person's name (or any other string) containing non-ASCII Unicode characters, how do we properly (in a localized manner) return a section name which corresponds with the sort order dictated by localizedCompare:?

Additionally, what's going on with localizedCompare: apparently treating "Ø" and "O" as NSOrderedSame when followed by additional characters?

levigroker

3 Answers

0

I expect localizedCompare: is using a specific combination of NSStringCompareOptions flags that is causing this behavior. See https://developer.apple.com/documentation/foundation/nsstringcompareoptions?preferredLanguage=occ

You might get the outcome you want by using compare:options: and turning on NSDiacriticInsensitiveSearch.

For generating the section index, it might be best to strip the value of all extended characters first, and then take the first letter. Something like:

[[str1 stringByFoldingWithOptions:NSCaseInsensitiveSearch | NSDiacriticInsensitiveSearch locale:[NSLocale currentLocale]] substringToIndex:1]

That way a name starting with an accented letter such as "Édward" will get converted to "Edward" before you take the first letter for the section.
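As a slightly fuller sketch of the folding idea above, the helper below folds case and diacritics and then takes the first composed character sequence instead of a fixed one-unit substring. The function name `FoldedSectionLetter` and the "#" fallback for empty names are assumptions made for illustration, not part of any API. Note that letters without a canonical decomposition (the comments below report this for "Ø") may come through folding unchanged.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns an uppercase, diacritic-folded first letter
// suitable for a section index. The "#" bucket for empty input is an assumption.
static NSString *FoldedSectionLetter(NSString *name) {
    if (name.length == 0) {
        return @"#";
    }
    // Fold case and diacritics using the user's current locale.
    NSString *folded = [name stringByFoldingWithOptions:(NSCaseInsensitiveSearch | NSDiacriticInsensitiveSearch)
                                                 locale:[NSLocale currentLocale]];
    if (folded.length == 0) {
        return @"#";
    }
    // Take the whole first composed character sequence, not just one UTF-16 unit.
    NSRange firstRange = [folded rangeOfComposedCharacterSequenceAtIndex:0];
    return [[folded substringWithRange:firstRange] uppercaseString];
}
```

For example, `FoldedSectionLetter(@"Édward")` should yield "E", since "É" decomposes into "E" plus a combining accent that the diacritic folding removes.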

A. Goodale
  • Thanks @a-goodale. Sadly we can't use `compare:options:` or similar. We're backed by an SQLite store so our options are limited. Specifically "The supported sort selectors for SQLite are compare: and caseInsensitiveCompare:, localizedCompare:, localizedCaseInsensitiveCompare:, and localizedStandardCompare:. The latter is Finder-like sorting and what most people should use most of the time. In addition, you cannot sort on transient properties using the SQLite store." See https://developer.apple.com/library/content/documentation/Cocoa/Conceptual/CoreData/PersistentStoreFeatures.html – levigroker Jun 28 '17 at 19:24
  • Similarly, `stringByFoldingWithOptions:locale:` still yields "Ø" for "Østerhus" thus putting "Østerhus" in a different section, causing the same strange issue with `localizedCompare:` apparently treating "Ø" and "O" as NSOrderedSame when followed by additional characters. :( – levigroker Jun 28 '17 at 19:33
0

Yeah, been there. The only solution I found was to create a second field that stores a simplified version of the characters (I don't remember the method offhand) and to use that field for search. Not super elegant, but it worked.

Jon Rose
  • Thanks. Yeah, I think the only robust approach is to store the normalized section names in the database so the FRC can sort on them. – levigroker Jun 28 '17 at 22:09
0

Ultimately the approach which solved this was to store normalized section names in the database.

The SO post that @MartinR suggested led me to https://stackoverflow.com/a/13292767/397210, which describes this approach and was the key "aha" moment in solving it.

While this does not explain the goofy behavior of localizedCompare: apparently treating "Ø" and "O" as NSOrderedSame when followed by additional characters, it is, IMHO, a more robust and complete solution which, in our testing, works for all Unicode code points.

Specifically, the approach is:

  1. Create (or utilize an existing) field on your entity to receive a normalized section name for the entity (let's call it sectionName).
  2. Populate this field (sectionName) with the normalized section name*, initially, and as needed (when the person name changes, for instance).
  3. Use this section name field (sectionName) for the sectionNameKeyPath of NSFetchedResultsController -initWithFetchRequest:managedObjectContext:sectionNameKeyPath:cacheName:
  4. For the sort descriptors used by the fetch request passed to the NSFetchedResultsController, be sure to sort first by section name and then by whatever orders the contents of each section (person name, for instance), taking care to use the localized versions of the comparison selectors, e.g.:

    [NSSortDescriptor sortDescriptorWithKey:@"sectionName" ascending:YES selector:@selector(localizedStandardCompare:)],
    [NSSortDescriptor sortDescriptorWithKey:@"personName" ascending:YES selector:@selector(localizedCaseInsensitiveCompare:)]
    
  5. Test.
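The steps above can be sketched roughly as follows. This is a minimal illustration, not the exact code from our project: the entity name "Person", the function names, and the nil cache name are assumptions, while `sectionName` and `personName` are the attribute names used in the steps above.

```objc
#import <CoreData/CoreData.h>

// Hypothetical fetch request for an assumed "Person" entity.
// Sorts by the stored section name first, then by name within each section.
static NSFetchRequest *PersonFetchRequest(void) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Person"];
    NSSortDescriptor *bySection =
        [NSSortDescriptor sortDescriptorWithKey:@"sectionName"
                                      ascending:YES
                                       selector:@selector(localizedStandardCompare:)];
    NSSortDescriptor *byName =
        [NSSortDescriptor sortDescriptorWithKey:@"personName"
                                      ascending:YES
                                       selector:@selector(localizedCaseInsensitiveCompare:)];
    request.sortDescriptors = @[ bySection, byName ];
    return request;
}

// Hypothetical factory: sections come from the same stored attribute
// that the first sort descriptor sorts on.
static NSFetchedResultsController *PersonResultsController(NSManagedObjectContext *context) {
    return [[NSFetchedResultsController alloc] initWithFetchRequest:PersonFetchRequest()
                                               managedObjectContext:context
                                                 sectionNameKeyPath:@"sectionName"
                                                          cacheName:nil];
}
```

Because the section key path and the first sort key are the same stored attribute, the section order and the row order cannot disagree, which is what the original exception was complaining about.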

*Normalized Section Name

We need to be careful about assuming what the first "character" is when dealing with Unicode. What the user perceives as a single character may be composed of more than one code point. See https://www.objc.io/issues/9-strings/unicode/ and also the SO question "Compare arabic strings with special characters ios".

This is the approach I used to generate a normalized section name:

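    // Decompose canonically so a base letter and its combining marks form one composed character sequence.
    NSString *decomposedString = name.decomposedStringWithCanonicalMapping;
    // Assumes name is non-empty; index 0 is out of range for an empty string.
    NSRange firstCharRange = [decomposedString rangeOfComposedCharacterSequenceAtIndex:0];
    NSString *firstChar = [decomposedString substringWithRange:firstCharRange];
    // retVal is declared earlier in the enclosing method.
    retVal = [firstChar localizedUppercaseString];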
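For anyone who wants this as a self-contained function, here is a minimal sketch wrapping the same calls. The function name `NormalizedSectionName` and the "#" bucket for nil or empty names are assumptions added for illustration; pick whatever fallback suits your index.

```objc
#import <Foundation/Foundation.h>

// Hypothetical wrapper around the snippet above.
// Returns the uppercased first composed character sequence of the name,
// or "#" (an assumed fallback bucket) when the name is nil or empty.
static NSString *NormalizedSectionName(NSString *name) {
    if (name.length == 0) {
        return @"#";
    }
    NSString *decomposed = name.decomposedStringWithCanonicalMapping;
    NSRange firstRange = [decomposed rangeOfComposedCharacterSequenceAtIndex:0];
    NSString *firstChar = [decomposed substringWithRange:firstRange];
    return firstChar.localizedUppercaseString;
}
```

The result would be written to the stored `sectionName` attribute whenever the person name is set or changed, so the database always holds a value the fetch request can sort on.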

Hopefully this approach is clear and useful to others, and thanks for the assist, all.

levigroker