Given two tables:
Data - Contains raw data, per value per day. ----------- Value | Price | Date | DataType (string)
Data may contain:
Units | Price | Date | DataType --------------------------------- 10 | 0.99 | March 1, 2014 | "1A" 20 | 0.99 | March 1, 2014 | "1B" 4 | 0.99 | March 1, 2014 | "2"
And may have hundreds of thousands of records.
Normally to query Data
I would use a predicate such as (uses MagicalRecord):
NSArray * results = [Data MR_findAllWithPredicate: [NSPredicate predicateWithFormat: @"date > x AND date < y AND dataType IN ('1A', '1B')"]];
// Calculate sums per day.
// .... Loops through results, units * price summed per day.
But with a large amount of rows, this is getting very slow.
There are a few things that I have tried to get this quicker.
Sum via
NSExpression
. Unfortunately this does not work due to the need of using amultiple:by
expression. (Fetch aggregate data from NSManagedObject using another expression as argument to sum: expression for more details).Now my thoughts are to create a cached table. I'm attempting to use a structure like the following:
CachedData ----------- Sum | Date | CacheType (integer)
Which I have created and tested inserting data into and it seems to be much faster (some of my calculations went from ~7 seconds to ~0.5s). But the problem is creating this cache table via core data.
If I was using raw SQLite I could do something similar to:
INSERT INTO CachedData (CacheType, Sum, Date) SELECT 0, (Value * Price), Date FROM Data WHERE DataType IN ('1A', '1B') AND NOT EXISTS(SELECT * FROM CachedData WHERE CacheType = 0 AND CachedData.Date = Data.Date);
Then anytime data had new inserts (never updates, just inserts), I could run this query quickly, having it only update my new data.
In coredata on the other hand, I can't seem to figure out a nice way to do this, other than selecting ALL of the uncached data, selecting all of the cached data and doing a few for loops to do comparisons on existence, then inserting. While this works, it is SLOOOOOOOOW (vs the sqlite query which can be done in seconds).
While thinking about this, I think a simple way would be to add a cached
bool column to my data table. Then at least in my enumeration loop I can do a predicate on that column. I'm still interested to know if there is a way to do this via coredata without adding an extra column though.
TL;DR
Is there a way I can get all the records for one table that do NOT exists in another table? Possibly through an NSPredicate
subquery?