6

I have a Core Data Model with three entities:
Person, Group, Photo with relationships between them as follows:

  • Person <<-----------> Group (one to many relationship)
  • Person <-------------> Photo (one to one)

When I perform a fetch using an NSFetchedResultsController in a UITableView, I want to group the Person objects into sections using the name attribute of the Group entity.

For that, I use sectionNameKeyPath:@"group.name".

The problem is that when I'm using the attribute from the Group relationship, the NSFetchedResultsController fetches everything upfront in small batches of 20 (I have setFetchBatchSize: 20) instead of fetching batches while I'm scrolling the tableView.

If I use an attribute from the Person entity (like sectionNameKeyPath:@"name") to create sections, everything works OK: the NSFetchedResultsController loads small batches of 20 objects as I scroll.

The code I use to instantiate the NSFetchedResultsController:

- (NSFetchedResultsController *)fetchedResultsController {

    if (_fetchedResultsController) {
        return _fetchedResultsController;
    }

    NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
    NSEntityDescription *entity = [NSEntityDescription entityForName:[Person description]
                                              inManagedObjectContext:self.managedObjectContext];

    [fetchRequest setEntity:entity];

    // Specify how the fetched objects should be sorted
    NSSortDescriptor *groupSortDescriptor = [[NSSortDescriptor alloc] initWithKey:@"group.name"
                                                                        ascending:YES];

    NSSortDescriptor *personSortDescriptor = [[NSSortDescriptor alloc] initWithKey:@"birthName"
                                                                         ascending:YES
                                                                          selector:@selector(localizedStandardCompare:)];


    [fetchRequest setSortDescriptors:[NSArray arrayWithObjects:groupSortDescriptor, personSortDescriptor, nil]];

    [fetchRequest setRelationshipKeyPathsForPrefetching:@[@"group", @"photo"]];
    [fetchRequest setFetchBatchSize:20];

    NSError *error = nil;
    NSArray *fetchedObjects = [self.managedObjectContext executeFetchRequest:fetchRequest error:&error];

    if (fetchedObjects == nil) {
        NSLog(@"Error Fetching: %@", error);
    }

    _fetchedResultsController = [[NSFetchedResultsController alloc] initWithFetchRequest:fetchRequest
                                                                    managedObjectContext:self.managedObjectContext sectionNameKeyPath:@"group.name" cacheName:@"masterCache"];

    _fetchedResultsController.delegate = self;

    return _fetchedResultsController;
}

This is what I get in Instruments if I create sections based on "group.name" without any interaction with the app's UI: (screenshot: Core Data Fetch with Sections by Relationship)

And this is what I get (with a bit of scrolling on the UITableView) if sectionNameKeyPath is nil: (screenshot: Core Data Fetch without any Sections)

Please, can anyone help me out on this issue?

EDIT 1:

It seems that I get inconsistent results from the simulator and Instruments: when I asked this question, the app was starting in the simulator in about 10 seconds (according to Time Profiler) using the above code.

But today, using the same code as above, the app starts in the simulator in 900ms even if it makes a temporary upfront fetch for all the objects and it's not blocking the UI.

I've attached some fresh screenshots: (screenshot: Time Profiler with Simulator) (screenshot: Upfront Fetch in Simulator without scrolling) (screenshot: Upfront Fetch in Simulator with scrolling and small batch fetches)

EDIT 2: I reset the simulator and the results are intriguing. After performing an import operation and quitting the app, the first run looked like this: (screenshot: First run after simulator reset and new import)

After a bit of scrolling: (screenshot: First run after simulator reset, new import and some scrolling)

Now this is what happens on a second run: (screenshot: Second run after simulator reset and new import)

After the fifth run: (screenshot: Fifth run)

EDIT 3: Running the app the seventh and eighth times, I get this: (screenshot: Seventh run) (screenshot: Eighth run)

Razvan
  • I believe one Rick had suggested [this link](http://www.cimgf.com/2013/01/03/nsfetchedresultscontroller-sectionnamekeypath-discussion/) but his answer got moderated. Anyway... Have a go at it. He thought it could answer your question. – staticVoidMan Aug 30 '14 at 22:23
  • Well written question. – Lorenzo B Sep 14 '14 at 18:24
  • @codeFi, is one of your primary concerns here that the fetching is blocking user interaction? – quellish Sep 15 '14 at 08:40
  • @quellish Yes, it's blocking the user interaction when the app starts because it's taking a long time to present the UI but this issue happens only in the simulator. Strangely enough, when running the app on an iPhone 4S even if I prefetch both Group and Photo entities and use the name attribute of the Group entity as sectionNameKeyPath, the app loads in ~900ms. – Razvan Sep 15 '14 at 10:15
  • So I don't know what to make of it... in the simulator I get one thing, on the device another... – Razvan Sep 15 '14 at 10:21
  • If you time profile it, is most of that time the fetches, or something else? It's not unusual for adding the store to the persistent store coordinator to take a long time. – quellish Sep 15 '14 at 10:41
  • @quellish I've edited my post with new information. – Razvan Sep 15 '14 at 10:50
  • In the time profile, invert the call tree, don't segregate by thread, and show top calls, should make it obvious where the time is being spent – quellish Sep 15 '14 at 11:02
  • @quellish there's no point in doing that right now because it seems everything loads very fast (690ms). But if you want to know what's taking the most computational time right now is __pread from libsystem_kernel.dylib (75ms). – Razvan Sep 15 '14 at 11:11
  • @codeFi, in a comment you say "it seems that Core Data prefers to store the thumbs as binary in the database tables", is that on the simulator, or the device? The behavior and performance of external records storage can differ significantly between the simulator and the device. – quellish Sep 15 '14 at 21:09
  • @quellish this happens in the simulator. I haven't looked at what's happening on the device from this perspective. – Razvan Sep 15 '14 at 21:11
  • When you specify in the model that Core Data can use external storage for a binary attribute, Core Data decides at runtime whether to store that binary data in the SQLite store or in an external file. Core Data's reasoning about that can be different when running on the device vs. the simulator. Additionally, migrating a store that has written external records files can be *very* slow. Is there any reason to think that you were performing a migration on the Instruments runs where you saw slowness? – quellish Sep 15 '14 at 21:29
  • No, I wasn't importing anything. The tests I did were with an already populated database. – Razvan Sep 15 '14 at 22:47

3 Answers

1

This is your stated objective: "I need the Person objects to be grouped in sections by the relationship entity Group, name attribute and the NSFetchResultsController to perform fetches in small batches as I scroll and not upfront as it is doing now."

The answer is a little complicated, primarily because of how an NSFetchedResultsController builds sections, and how that affects the fetching behavior.

TL;DR: To change this behavior, you would need to change how NSFetchedResultsController builds sections.

What is happening?

When an NSFetchedResultsController is given a fetch request with pagination (fetchLimit and/or fetchBatchSize), several things happen.

If no sectionNameKeyPath is specified, it does exactly what you expect. The fetch returns a proxy array of results, with "real" objects for the first fetchBatchSize items. So for example, if you have set fetchBatchSize to 2 and your predicate matches 10 items in the store, the results contain the first two objects. The other objects will be fetched separately as they are accessed. This provides a smooth, paginated experience.

However, when a sectionNameKeyPath is specified, the fetched results controller has to do a bit more. To compute the sections it needs to access that key path on all the objects in the results. It enumerates the 10 items in the results in our example. The first two have already been fetched. The other 8 will be fetched during enumeration to get the key path value needed to build the section information. If you have a lot of results for your fetch request, this can be very inefficient. There are a number of public bugs concerning this functionality:

NSFetchedResultsController initially takes too long to set up sections

NSFetchedResultsController ignores fetchLimit property

NSFetchedResultsController, Table Index, and Batched Fetch Performance Issue

... And several others. When you think about it, this makes sense. To build the NSFetchedResultsSectionInfo objects requires the fetched results controller to see every value in the results for the sectionNameKeyPath, aggregate them to the unique union of values, and use that information to create the correct number of NSFetchedResultsSectionInfo objects, set the name and index title, know how many objects in the results a section contains, etc. To handle the general use case there is no way around this. With that in mind, your Instruments traces may make a lot more sense.
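To make the cost concrete, here is a rough sketch of the kind of enumeration that section building implies. This is not NSFetchedResultsController's actual implementation, only an illustration; the helper name `SectionNamesByEnumeration` is hypothetical and `context` is assumed to be an existing managed object context using the Person/Group model from the question.

```objective-c
#import <CoreData/CoreData.h>

// Illustration only: approximates the work needed to derive section names.
// `context` is assumed to be an existing NSManagedObjectContext.
static NSOrderedSet *SectionNamesByEnumeration(NSManagedObjectContext *context,
                                               NSError **error) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Person"];
    request.sortDescriptors =
        @[[NSSortDescriptor sortDescriptorWithKey:@"group.name" ascending:YES]];
    request.fetchBatchSize = 20;

    // With a batch size set, this returns a proxy array backed by batches.
    NSArray *people = [context executeFetchRequest:request error:error];
    if (people == nil) {
        return nil;
    }

    NSMutableOrderedSet *names = [NSMutableOrderedSet orderedSet];
    for (NSManagedObject *person in people) {
        // Reading the key path on every element makes the proxy array load
        // each batch in turn, so every matching row ends up being fetched.
        NSString *name = [person valueForKeyPath:@"group.name"];
        [names addObject:(name ?: @"")];
    }
    return names;
}
```

Walking the whole array like this is what turns "20 objects as I scroll" into a series of batch fetches right at startup.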

How can you change this?

You can attempt to build your own NSFetchedResultsController that provides an alternative strategy for building NSFetchedResultsSectionInfo objects, but you may run into some of the same problems. For example, if you are using the existing fetchedObjects functionality to access members of the fetch results, you will encounter the same behavior when accessing objects that are faults. Your implementation would need a strategy for dealing with this (it's doable, but very dependent on your needs and requirements).

Oh god no. What about some kind of temporary hack that just makes it perform a little better but doesn't fix the problem?

Altering your data model will not change the above behavior, but can change the performance impact slightly. Batch updates will not have any significant effect on this behavior, and in fact will not play nicely with a fetched results controller. It may be much more useful to you, however, to instead set the relationshipKeyPathsForPrefetching to include your "group" relationship, which may improve the fetching and faulting behavior significantly. Another strategy may be to perform another fetch to batch fault these objects before you attempt to use the fetched results controller, which will populate the various levels of Core Data in-memory caches in a more efficient manner.
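As a minimal sketch of the "perform another fetch to batch fault these objects" idea, something like the following could run before the fetched results controller performs its fetch. The helper name `PrefetchGroups` is hypothetical, and whether this helps at all depends on your data, so measure it in Instruments.

```objective-c
#import <CoreData/CoreData.h>

// Hypothetical helper: loads every Group with its attribute data in one
// round trip, so later reads of group.name do not each fire a fault.
// The caller should keep the returned array alive while it is useful.
static NSArray *PrefetchGroups(NSManagedObjectContext *context, NSError **error) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Group"];
    request.returnsObjectsAsFaults = NO; // fetch full objects, not faults
    return [context executeFetchRequest:request error:error];
}
```

This does not stop the controller from enumerating every Person; it only aims to make each `group.name` lookup cheaper once the Group data is already in memory.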

The NSFetchedResultsController cache is primarily for section information. This prevents the sections from having to be completely recalculated on each change (in the best case), but can actually make the initial fetch to build the sections take much longer. You will have to experiment to see if the cache is worthwhile for your use case.

If your primary concern is that these Core Data operations are blocking user interaction, you can offload them from the main thread. NSFetchedResultsController can be used on a private queue (background) context, which will prevent Core Data operations from blocking the UI.
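A sketch of that setup, under the assumption that you already have a persistent store coordinator and a configured fetch request, might look like this. The function name and the completion block are hypothetical, not part of any Apple API.

```objective-c
#import <CoreData/CoreData.h>

// Hypothetical helper: runs the sectioned fetch on a private queue context
// and reports back on the main queue once the sections have been built.
static void PerformSectionedFetchInBackground(
    NSPersistentStoreCoordinator *coordinator,
    NSFetchRequest *fetchRequest,
    void (^completion)(NSFetchedResultsController *controller, NSError *error)) {

    NSManagedObjectContext *backgroundContext = [[NSManagedObjectContext alloc]
        initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    backgroundContext.persistentStoreCoordinator = coordinator;

    NSFetchedResultsController *controller = [[NSFetchedResultsController alloc]
        initWithFetchRequest:fetchRequest
        managedObjectContext:backgroundContext
          sectionNameKeyPath:@"group.name"
                   cacheName:nil];

    [backgroundContext performBlock:^{
        NSError *error = nil;
        BOOL succeeded = [controller performFetch:&error];
        dispatch_async(dispatch_get_main_queue(), ^{
            completion(succeeded ? controller : nil, error);
        });
    }];
}
```

Keep in mind that the objects such a controller returns belong to the private queue context, so they should only be touched inside `performBlock:` or `performBlockAndWait:`; the table view code would need to copy out the values it displays or pass object IDs across to a main queue context.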

quellish
  • You missed the point of my answer. Batch updates are used to update the new attribute (say `groupName`) on the `Person` entity when the `name` of a `Group` changes, and this is not related to `NSFetchedResultsController` in any way. About what you are saying here: *Altering your data model will not change the above behavior, but can change the performance impact slightly*. I think that is not correct, but just to be sure I will set up an example and come back with some tests. – Lorenzo B Sep 15 '14 at 07:18
  • The author's question is about `NSFetchedResultsController` performance when calculating sections for the first time. Batch updates are not relevant to this, and in common scenarios will be harmful: coordinating batch changes with an `NSFetchedResultsController` is non-trivial. The behavior the author describes and is looking for guidance on is endemic to how `NSFetchedResultsController` must perform its job. Normalizing the data model doesn't change this; it only means it will take a little more data to see the same performance impact, at best. – quellish Sep 15 '14 at 07:36
  • @quellish actually flexaddicted is right: a few days ago, I've tried using an attribute groupName on the Person entity and creating sections from it. It worked as expected: there were no upfront fetches anymore, the Person objects were grouped by groupName and the FetchController was fetching small batches as I was scrolling. However, from my point of view this approach defeats the purpose of having relationships. – Razvan Sep 15 '14 at 11:01
0

Based on my experience, one way to achieve your goal is to denormalize your model. In particular, you could add a group attribute to your Person entity and use that attribute as the sectionNameKeyPath. So, when you create a Person, you should also set the group it belongs to on that attribute.

This denormalization is acceptable because it lets you avoid fetching the related Group objects, which are no longer needed to build the sections. A drawback is that if you change the name of a group, every person associated with that group must be updated as well; otherwise you end up with incorrect values.
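As a rough sketch, assuming a hypothetical string attribute `groupName` on Person that mirrors the group's name, the controller could be built like this (the helper name `MakePeopleController` is also just for illustration):

```objective-c
#import <CoreData/CoreData.h>

// Assumes Person has a hypothetical `groupName` attribute kept in sync
// with group.name. Sections are then computed from Person's own column.
static NSFetchedResultsController *MakePeopleController(NSManagedObjectContext *context) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Person"];
    request.sortDescriptors = @[
        // The first sort key must match the section key path.
        [NSSortDescriptor sortDescriptorWithKey:@"groupName" ascending:YES],
        [NSSortDescriptor sortDescriptorWithKey:@"birthName"
                                      ascending:YES
                                       selector:@selector(localizedStandardCompare:)]
    ];
    request.fetchBatchSize = 20;

    return [[NSFetchedResultsController alloc] initWithFetchRequest:request
                                               managedObjectContext:context
                                                 sectionNameKeyPath:@"groupName"
                                                          cacheName:nil];
}
```

Call `performFetch:` on the returned controller and set its delegate as usual.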

The key aspect here is the following. You need to keep in mind that Core Data is not a relational database. The model should not be designed as a database schema, where normalization would apply; it should be designed from the perspective of how the data is presented and used in the user interface.

Edit 1

I cannot understand your comment, could you explain better?

What I've found very intriguing though is that even if the app is performing a full upfront fetch in the simulator, the app loads in 900ms (with 5000 objects) on the device despite the simulator where it loads much slower.

Anyway, I would be interested in knowing details about your Photo entity. If you pre-fetch photo the overall execution could be influenced.

Do you need to pre-fetch a Photo within your table view? Are they thumbs (small photos)? Or normal images? Do you take advantage of External Storage Flag?

Adding an additional attribute (say group) to the Person entity should not be a problem. Updating the value of that attribute when the name of a Group object changes is not a problem if you perform it in the background. In addition, starting from iOS 8 you have batch updates available, as described in Core Data Batch Updates.
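For illustration, a batch update that pushes a new group name into the denormalized attribute might look roughly like this on iOS 8 or later. The attribute name `groupName` and the helper `RenameGroupForPeople` are hypothetical, and `group` is assumed to be an already saved Group object.

```objective-c
#import <CoreData/CoreData.h>

// Hypothetical helper: writes `newName` into Person.groupName for every
// person in `group`, directly in the store (iOS 8+).
static BOOL RenameGroupForPeople(NSManagedObjectContext *context,
                                 NSManagedObject *group,
                                 NSString *newName,
                                 NSError **error) {
    NSBatchUpdateRequest *update =
        [NSBatchUpdateRequest batchUpdateRequestWithEntityName:@"Person"];
    update.predicate = [NSPredicate predicateWithFormat:@"group == %@", group];
    update.propertiesToUpdate =
        @{ @"groupName": [NSExpression expressionForConstantValue:newName] };
    update.resultType = NSUpdatedObjectsCountResultType;

    NSPersistentStoreResult *result = [context executeRequest:update error:error];
    return result != nil;
}
```

Batch updates write straight to the persistent store and bypass the context, so objects already in memory (and any fetched results controller observing them) will not see the change until they are refreshed or refetched.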

Lorenzo B
  • Thanks for your useful answer. I've been thinking of that but, as you said, the problem arises when I want to change the group name for thousands of objects related to a particular group. What I've found very intriguing though is that even if the app is performing a full upfront fetch in the simulator, the app loads in 900ms (with 5000 objects) on the device despite the simulator where it loads much slower. I think something else is happening on the device... The device being an iPhone 4S with the latest iOS 7 build. – Razvan Sep 14 '14 at 19:52
  • Regarding your last phrase, you are absolutely right! I will not present to the user thousands of objects grouped by sections in a single tableview. Instead I will have a tableview with some cells representing the groups and segues from each of them to the content of each group. But this is a personal exercise for me to understand Core Data and this issue is very frustrating for me. – Razvan Sep 14 '14 at 20:00
  • Added some other hints. – Lorenzo B Sep 14 '14 at 21:45
  • To the downvoter. Comments on downvotes should be inserted too. – Lorenzo B Sep 15 '14 at 07:07
  • The author's stated goal is: "I need the Person objects to be grouped in sections by the relationship entity Group, name attribute and the NSFetchResultsController to perform fetches in small batches as I scroll and not upfront as it is doing now." Your answer does not address this. – quellish Sep 15 '14 at 07:38
  • @quellish denormalizing the model could improve performance. If you put a new attribute, say `groupName`, in the `Person` entity, that attribute could be used to group and to avoid upfront fetches. – Lorenzo B Sep 15 '14 at 08:19
  • @flexaddicted denormalizing the model will not change the fetch behavior when building the initial sections, which is what the author is asking for guidance on. Denormalizing the model will at best make it take less time to do the thing he already doesn't want to do. If you sufficiently deMORALIZED the fetched results controller, maybe that would change its behavior. But probably not. – quellish Sep 15 '14 at 08:25
  • @flexaddicted on the device, the app loads very fast (in 900ms) while in the simulator loads in a few seconds. – Razvan Sep 15 '14 at 10:34
  • @flexaddicted regarding the photos I use to display in the tableview: they are small thumbnails (5.000 of 48KB each) and the attribute has Allow External Storage option enabled. However, it seems that Core Data prefers to store the thumbs as binary in the database tables because the sqlite file reaches ~130MB in size. – Razvan Sep 15 '14 at 10:37
0

Almost a year after posting this question, I've finally found the culprits behind this behaviour (which changed slightly in Xcode 6):

  1. Regarding the inconsistent fetch times: I was using a cache and at the time I was back and forth with opening, closing and resetting the simulator.

  2. Regarding the fact that everything was fetched upfront in small batches without scrolling (in Xcode 6's Core Data Instruments that's not the case anymore - now it's one, big fetch which takes entire seconds):

It seems that setFetchBatchSize does not work correctly with parent/child contexts. The issue was reported back in 2012 and it seems that it's still there http://openradar.appspot.com/11235622.

To overcome this issue, I created another independent context with NSMainQueueConcurrencyType and set its persistent store coordinator to the same one that my other contexts are using.
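A minimal sketch of that workaround, assuming `coordinator` is the persistent store coordinator the rest of the stack already uses (the helper name `MakeStandaloneMainContext` is hypothetical):

```objective-c
#import <CoreData/CoreData.h>

// Hypothetical helper: a main queue context attached directly to the
// coordinator (no parentContext), intended for use with the fetched
// results controller.
static NSManagedObjectContext *MakeStandaloneMainContext(
    NSPersistentStoreCoordinator *coordinator) {
    NSManagedObjectContext *context = [[NSManagedObjectContext alloc]
        initWithConcurrencyType:NSMainQueueConcurrencyType];
    context.persistentStoreCoordinator = coordinator;
    return context;
}
```

Because this context is not a child of the others, it will not pick up their saves automatically; one common approach is to observe NSManagedObjectContextDidSaveNotification and call `mergeChangesFromContextDidSaveNotification:` on it.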

More about issue #2 here: https://stackoverflow.com/a/11470560/1641848

Razvan