
I've been stuck for about two weeks on a nasty Core Data problem. I have read lots of blog posts, articles and SO questions/answers, but I'm still not able to solve it.

I ran lots of tests and was able to reduce the larger problem to a smaller one. It's going to be a long explanation, so bear with me!

Problem - datamodel

I have the following data model:

Object A has a one-to-many relationship with object B, which has another one-to-many relationship with object C. Following Core Data recommendations, I created inverse relationships, so each instance of B points to its parent A, and each instance of C points to its parent B.

A <->> B <->> C

Problem - MOC setup

To keep responsiveness smooth as butter I created a three-level managedObjectContext structure.

  1. Parent MOC - Runs on its own private queue using NSPrivateQueueConcurrencyType and is tied to the persistentStoreCoordinator
  2. MainQueue MOC - Runs on the mainThread using NSMainQueueConcurrencyType and has parent MOC 1
  3. For each parsing operation I create a third MOC which also has its private queue and has parent mainQueue MOC

My main data controller is added as an observer of the NSManagedObjectContextDidSaveNotification of MOC 2, so every time MOC 2 saves, a performBlock: on MOC 1 is triggered which performs a save operation (asynchronously, because of performBlock:).
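The setup described above can be sketched roughly as follows. This is only an illustration of the three-level structure, not the code from the actual project: the class name `ContextStack`, its property names and the method `newParsingContext` are hypothetical, and the persistent store coordinator is assumed to be created elsewhere.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper class; all names here are illustrative assumptions.
@interface ContextStack : NSObject
@property (nonatomic, strong, readonly) NSManagedObjectContext *persistingContext; // MOC 1
@property (nonatomic, strong, readonly) NSManagedObjectContext *mainContext;       // MOC 2
- (instancetype)initWithCoordinator:(NSPersistentStoreCoordinator *)coordinator;
- (NSManagedObjectContext *)newParsingContext;                                     // MOC 3
@end

@implementation ContextStack

- (instancetype)initWithCoordinator:(NSPersistentStoreCoordinator *)coordinator
{
    if ((self = [super init])) {
        // MOC 1: private queue, the only context attached to the coordinator.
        _persistingContext = [[NSManagedObjectContext alloc]
            initWithConcurrencyType:NSPrivateQueueConcurrencyType];
        _persistingContext.persistentStoreCoordinator = coordinator;

        // MOC 2: main queue, child of MOC 1.
        _mainContext = [[NSManagedObjectContext alloc]
            initWithConcurrencyType:NSMainQueueConcurrencyType];
        _mainContext.parentContext = _persistingContext;

        // Whenever MOC 2 saves, push the changes to disk through MOC 1.
        [[NSNotificationCenter defaultCenter]
            addObserver:self
               selector:@selector(mainContextDidSave:)
                   name:NSManagedObjectContextDidSaveNotification
                 object:_mainContext];
    }
    return self;
}

- (void)dealloc
{
    [[NSNotificationCenter defaultCenter] removeObserver:self];
}

// MOC 3: one private-queue child of the main context per parsing operation.
- (NSManagedObjectContext *)newParsingContext
{
    NSManagedObjectContext *context = [[NSManagedObjectContext alloc]
        initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    context.parentContext = self.mainContext;
    return context;
}

- (void)mainContextDidSave:(NSNotification *)notification
{
    NSManagedObjectContext *context = self.persistingContext;
    // performBlock: returns immediately; the save runs on MOC 1's queue.
    [context performBlock:^{
        NSError *error = nil;
        if (![context save:&error]) {
            NSLog(@"Error while saving to the persistent store: %@", error);
        }
    }];
}

@end
```

Note that in this arrangement a save of MOC 3 only pushes changes into MOC 2, and a save of MOC 2 only pushes them into MOC 1, so objects touched by an import end up registered in every level of the chain.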

Problem - Parsing

To parse a large JSON file into my Core Data structure I wrote a recursive parser. This parser starts by creating a new MOC (3). It then takes the data for object A and parses its properties. Then the parser reads out the JSON relations for B and creates the corresponding objects, which are filled with data. These new objects are added to A by calling addBObject: on A. Because the parser is recursive, parsing B means parsing C, and here also new objects are created and attached to B. This all happens in the performBlock: on MOC 3.

  • Parse (creates 'A'-objects and starts parsing B)
    • Parsing A (creates 'B'-objects, attaches them to A and starts parsing C)
      • Parsing B (creates 'C'-objects, attaches them to B)
        • Parsing C (just stores data in a C-object)

After each parsing operation I save MOC 3 and dispatch a save operation of the main MOC (2) on the main thread. Because of the NSManagedObjectContextDidSaveNotification, MOC 1 will then autosave asynchronously.

        if (parsed){
            NSError *error = nil;
            if (![managedObjectContext save:&error])
                NSLog(@"Error while saving parsed data: %@", error);
        }else{
            // something went wrong, discard changes
            [managedObjectContext reset];
        }

        dispatch_async(dispatch_get_main_queue(), ^{                
            // save mainQueueManagedObjectContext
            [[HWOverallDataController sharedOverallDataController] saveMainThreadManagedObjectContext];
        });

To reduce my memory footprint, and because I do not need the parsed data for now, I call:

[a.managedObjectContext refreshObject:a mergeChanges:NO];

for each A I just parsed.

Because I need to parse about 10 A's, which each have about 10 B's, which each have about 10 C's, a lot of managedObjects are generated.

Problem - Instruments

Everything works fine. The only thing is: when I turn on the Allocations instrument I see unreleased A's, B's and C's. I don't get any useful information from their retain counts or anything else. And because my actual problem involves a more complex data model, the living objects become a serious memory problem. Can someone figure out what I'm doing wrong? Calling refreshObject:mergeChanges: on the other managedObjectContexts with the correct managedObject does not work either. Only a hard reset seems to work, but then I lose my pointers to living objects used by the UI.

Other solutions I tried

  • I tried creating unidirectional relations instead of bidirectional ones. This creates a lot of other problems which cause Core Data inconsistencies and weird behavior, such as dangling objects and Core Data generating 1-n relations instead of n-n relations (because the inverse relation is not known).

  • I tried refreshing each changed or inserted object whenever I receive an NSManagedObjectContextDidSaveNotification for any context.

Both of these 'solutions' (which don't work, by the way) also seem a bit hacky. This should not be the way to go. Surely there is a way of getting this to work without raising the memory footprint while keeping the UI smooth?

- CodeDemo

http://cl.ly/133p073h2I0j

- Further Investigation

After refreshing every object ever used (which is tedious work) in the mainContext (after a mainSave), the objects' sizes are reduced to 48 bytes each. This indicates that the objects are all faulted, but that there is still a pointer left in memory. With about 40,000 faulted objects, that is still about 1.92 MB (40,000 × 48 bytes) in memory which is never released until the persistentManagedObjectContext is reset. And this is something we don't want to do, because we lose every reference to any managedObject.

Robin van Dijke
  • That's a nice long description, but without code, I doubt you will get much more than guesses... – Jody Hagins Oct 29 '12 at 20:28
  • In my opinion it's not a problem for which code is very relevant. It's more conceptual. But I could make a small demo project in which you can encounter the problem if that would be necessary. – Robin van Dijke Oct 29 '12 at 20:58
  • I added an Xcode project which illustrates the problem – Robin van Dijke Oct 30 '12 at 08:24
  • Wrap your import/parse processing in an @autoreleasepool block so that the pool is cleared for every Object A (and its Bs and Cs). – Rog Oct 30 '12 at 10:12
  • @Rog No, that's probably not it. Wrapping them between an autoreleaseblock does not solve the problem. After more debugging it seems that the mainManagedObjectContext (2) is keeping a strong reference to the A objects. – Robin van Dijke Oct 30 '12 at 10:27

3 Answers

5

Robin,

I have a similar problem which I solved differently than you have. In your case, you have a third, IMO, redundant MOC, the parent MOC. In my case, I let the two MOCs communicate, in an old school fashion, through the persistent store coordinator via the DidSave notifications. The new block oriented APIs make this much simpler and robust. This lets me reset the child MOCs. While you gain a performance advantage from your third MOC, it isn't that great of an advantage over the SQLite row cache which I exploit. Your path consumes more memory. Finally, I can, by tracking the DidSave notifications, trim items as they are created.
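For readers unfamiliar with this arrangement, a minimal sketch of two sibling contexts sharing one persistent store coordinator might look like the following. It is an illustration under assumptions, not the code from my app: the class name `SiblingContexts` and the choice of queue-based concurrency types are hypothetical, and merge policies are left at their defaults.

```objc
#import <CoreData/CoreData.h>

// Hypothetical controller; class and method names are assumptions.
@interface SiblingContexts : NSObject
@property (nonatomic, strong, readonly) NSManagedObjectContext *moc;           // UI context
@property (nonatomic, strong, readonly) NSManagedObjectContext *backgroundMOC; // import context
- (instancetype)initWithCoordinator:(NSPersistentStoreCoordinator *)coordinator;
@end

@implementation SiblingContexts

- (instancetype)initWithCoordinator:(NSPersistentStoreCoordinator *)coordinator
{
    if ((self = [super init])) {
        // Both contexts attach directly to the coordinator; neither is a child.
        _moc = [[NSManagedObjectContext alloc]
            initWithConcurrencyType:NSMainQueueConcurrencyType];
        _moc.persistentStoreCoordinator = coordinator;

        _backgroundMOC = [[NSManagedObjectContext alloc]
            initWithConcurrencyType:NSPrivateQueueConcurrencyType];
        _backgroundMOC.persistentStoreCoordinator = coordinator;

        // Listen for saves made by the import context.
        [[NSNotificationCenter defaultCenter]
            addObserver:self
               selector:@selector(backgroundContextDidSave:)
                   name:NSManagedObjectContextDidSaveNotification
                 object:_backgroundMOC];
    }
    return self;
}

- (void)dealloc
{
    [[NSNotificationCenter defaultCenter] removeObserver:self];
}

- (void)backgroundContextDidSave:(NSNotification *)notification
{
    NSManagedObjectContext *moc = self.moc;
    // Merge on the main context's own queue.
    [moc performBlock:^{
        [moc mergeChangesFromContextDidSaveNotification:notification];
    }];
}

@end
```

Because the import context is not a parent or child of the UI context, it can be reset after each import without invalidating any object the UI holds.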

BTW, you are also probably suffering from a massive increase in the size of your MALLOC_TINY and MALLOC_SMALL VM regions. My trailing trimming algorithm lets the allocators reuse space sooner and, hence, retards the growth of these problematic regions. These regions are, in my experience, due to their large resident memory footprint a major cause for my app, Retweever, being killed. I suspect your app suffers the same fate.

When the memory warnings come, I call the below snippet:

[self.backgroundMOC performBlock: ^{ [self.backgroundMOC reset]; }];

[self.moc save];

[self.moc.registeredObjects trimObjects];

-[NSArray(DDGArray) trimObjects] just goes through an array and refreshes the object, thus trimming them.
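The category itself is not shown here, so the following is only a guess at what such a method could look like. The check for unsaved changes is an added assumption, included because `refreshObject:mergeChanges:NO` discards pending changes on the object.

```objc
#import <CoreData/CoreData.h>

// Hypothetical reconstruction of the category; the original source is not shown.
@interface NSArray (DDGArray)
- (void)trimObjects;
@end

@implementation NSArray (DDGArray)

- (void)trimObjects
{
    for (id object in self) {
        // Ignore anything in the array that is not a managed object.
        if (![object isKindOfClass:[NSManagedObject class]]) continue;

        NSManagedObject *managedObject = object;
        // Assumption: leave objects with unsaved changes alone,
        // since refreshing them would throw those changes away.
        if (managedObject.hasChanges) continue;

        // Turn the object back into a fault so its property data can be released.
        [managedObject.managedObjectContext refreshObject:managedObject
                                             mergeChanges:NO];
    }
}

@end
```

Since `registeredObjects` is an `NSSet`, a caller using this sketch would pass `registeredObjects.allObjects`, or the category would have to be declared on `NSSet` instead. Either way, the call should run on the queue of the context that owns the objects.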

In summary, Core Data appears to implement a copy-on-write algorithm for items that appear in many MOCs. Hence, you have things retained in unexpected ways. I focus upon breaking these connections after import to minimize my memory footprint. My system, due to the SQLite row cache, appears to perform acceptably well.

Andrew

adonoho
  • Thanks for the suggestion. A few questions: What is your MOC structure? A persistentStoreCoordinator with two MOC's attached to it? Or a parent/child relation? And which one of them is the backgroundMOC? – Robin van Dijke Oct 31 '12 at 13:38
  • Both the main MOC, `self.moc`, and the background MOC, `self.backgroundMOC`, use the same store coordinator. Their merge policies are slightly different favoring the main MOC. They of course, communicate using the `DidSave` notifications. – adonoho Oct 31 '12 at 22:53
  • Indeed, that's what I expected. We're now getting our backend ready to use two MOC's in this setup. I'll let you know the result – Robin van Dijke Nov 01 '12 at 08:10
  • Robin, let me add that I also have a few other scratchpad MOCs in my app. They are there to support the UI by caching what their model calculations need. They are also easily reset. Due to memory concerns, I don't make them children of the main MOC. Basically, I view the parent-child MOC pattern as an alternative to the MOC notification pattern, not a replacement pattern. Being able to trim memory on demand gives me some advantages in an app with large resident VM regions. With your 40K items, you share my pain. You really must manage the growth of your resident regions. Andrew – adonoho Nov 01 '12 at 13:04
  • I finally tested everything and this seems to work perfectly. Thank you very much for your response. This was the answer I was looking for – Robin van Dijke Nov 05 '12 at 14:30
1

For every NSManagedObjectContext that you keep around for a specific purpose, you are going to accumulate instances of NSManagedObject.

An NSManagedObjectContext is just a piece of scratch paper that you can instantiate at will, save if you wish to keep changes in the NSPersistentStore, and then discard afterward.

For the parsing operations (layer 3), try creating a MOC for the op, do your parsing, save the MOC and then discard it afterwards.

It feels like you have at least one layer of MOC too many being held by strong references.

Basically, ask this question for each of the MOCs: "Why am I keeping this object and its associated children alive?"
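As a rough illustration of the create, parse, save, discard cycle, something along these lines could be used. The function name `ImportWithScratchContext` and the `work` block are hypothetical; the parsing itself is whatever your importer already does.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: runs one import on a throwaway child context.
static void ImportWithScratchContext(NSManagedObjectContext *parent,
                                     void (^work)(NSManagedObjectContext *scratch))
{
    NSManagedObjectContext *scratch = [[NSManagedObjectContext alloc]
        initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    scratch.parentContext = parent;

    [scratch performBlock:^{
        @autoreleasepool {
            // Caller-supplied parsing; it should only touch objects in `scratch`.
            work(scratch);

            NSError *error = nil;
            if (scratch.hasChanges && ![scratch save:&error]) {
                NSLog(@"Error while saving scratch context: %@", error);
            }
            // Forget every object registered with the scratch context.
            [scratch reset];
        }
        // Nothing outside this block references the context,
        // so it can be deallocated once the block has finished.
    }];
}
```

Saving the scratch context only pushes the changes one level up, so the parent still has to be saved on its own queue for the data to reach the store.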

Warren Burton
  • This is exactly what I am doing right now. For each parse operation I create a new MOC (3) which is a child of the MOC at level 2 (mainQueue MOC). After parsing the MOC (3) is saved and not strongly referenced anymore (so should be released by ARC). – Robin van Dijke Oct 30 '12 at 08:58
0

I have an import helper that does something very similar.

Have a look at the code below and see if it helps you

__block NSUInteger i = 0;
NSArray *jsonArray = ...
for (NSDictionary *dataStructure in jsonArray)
{
    [managedObjectContext performBlock:^{
        @autoreleasepool {
            i++;
            A *a = (A*)[self newManagedObjectOfType:@"A" inManagedObjectContext:managedObjectContext];
            [self parseData:[dataStructure objectForKey:@"a"]
                 intoObject:a
     inManagedObjectContext:managedObjectContext];

            [managedObjectContext refreshObject:a
                                   mergeChanges:YES];
            if (i > 20) // Arbitrary number here
            {
                NSError *error = nil;
                if (![managedObjectContext save:&error])
                    NSLog(@"Error while saving parsed data: %@", error);
                [managedObjectContext reset];
                i = 0; // start counting the next batch
            }

            [managedObjectContext refreshObject:a
                                   mergeChanges:YES];

        }
        dispatch_async(dispatch_get_main_queue(), ^{
            [self saveMainThreadManagedObjectContext];

            NSLog(@"DONE");
            // parsing is done, now you see that there are still
            // A's, B's and C's left in memory.
            // Every managedObjectContext is saved and no references are kept
            // to any A, B and C so they should be released. This is not true,
            // so a managedObjectContext is keeping a strong reference to these
            // objects.
        });
    }];
}
Rog
  • Thanks for the suggestion. I tried it, but there are still 2 instances of each object active (in the mainManagedObjectContext (2) and the persistentManagedObjectContext (3)). – Robin van Dijke Oct 30 '12 at 11:00