0

Given two tables:

Data - Contains raw data, per value per day.
-----------
Value | Price | Date | DataType (string)

Data may contain:

Units | Price | Date          | DataType
---------------------------------
10    | 0.99  | March 1, 2014 | "1A"
20    | 0.99  | March 1, 2014 | "1B"
4     | 0.99  | March 1, 2014 | "2"

And may have hundreds of thousands of records.

Normally to query Data I would use a predicate such as (uses MagicalRecord):

NSArray * results = [Data MR_findAllWithPredicate: [NSPredicate predicateWithFormat: @"date > x AND date < y AND dataType IN ('1A', '1B')"]];

// Calculate sums per day.
// .... Loops through results, units * price summed per day.

But with a large amount of rows, this is getting very slow.

There are a few things that I have tried to get this quicker.

  1. Sum via NSExpression. Unfortunately this does not work due to the need of using a multiple:by expression. (Fetch aggregate data from NSManagedObject using another expression as argument to sum: expression for more details).

  2. Now my thoughts are to create a cached table. I'm attempting to use a structure like the following:

CachedData
-----------
Sum | Date | CacheType (integer)

Which I have created and tested inserting data into and it seems to be much faster (some of my calculations went from ~7 seconds to ~0.5s). But the problem is creating this cache table via core data.

If I was using raw SQLite I could do something similar to:

INSERT INTO CachedData
  (CacheType, Sum, Date)

SELECT
  0,
  (Value * Price),
  Date
FROM
  Data
WHERE
  DataType IN ('1A', '1B') AND
  NOT EXISTS(SELECT * FROM CachedData WHERE CacheType = 0 AND CachedData.Date = Data.Date);

Then anytime data had new inserts (never updates, just inserts), I could run this query quickly, having it only update my new data.

In coredata on the other hand, I can't seem to figure out a nice way to do this, other than selecting ALL of the uncached data, selecting all of the cached data and doing a few for loops to do comparisons on existence, then inserting. While this works, it is SLOOOOOOOOW (vs the sqlite query which can be done in seconds).

While thinking about this, I think a simple way would be to add a cached bool column to my data table. Then at least in my enumeration loop I can do a predicate on that column. I'm still interested to know if there is a way to do this via coredata without adding an extra column though.

TL;DR

Is there a way I can get all the records for one table that do NOT exists in another table? Possibly through an NSPredicate subquery?

Community
  • 1
  • 1
Kyle
  • 17,317
  • 32
  • 140
  • 246

1 Answers1

1

First, with your original desire, did you set your attributes to be indexed? This has some overhead for inserts, but will really speed up access like you want.

Second, why run a query when data is inserted. You could easily catch the DidSave notification from the MOC, and do what you want for a newly inserted object, which would include adding into another store. Or, you could use a NSFetchedResultsController and just implement its delegate... which will provide a similar result of watching what's happening in a managed object context... with a probably more familiar API. You don't have to use a FRC with a TableView...

Jody Hagins
  • 27,943
  • 6
  • 58
  • 87