The following is an alternative approach, which does not use SUBQUERY, but should ultimately have the same results. I've no idea whether this would in practice be more or less efficient than using SUBQUERY.
Your specific concern is the inefficiency of counting all the matching Authors, rather than just stopping when the first matching Author is found. Bear in mind that there is a lot going on behind the scenes whenever a fetch request is processed. First, CoreData has to parse your predicate and turn it into an equivalent SQLite query, which will look something like this:
SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZNAME, ... FROM ZBOOK t0 WHERE (SELECT COUNT(t1.Z_PK) FROM ZAUTHOR t1 WHERE (t0.Z_PK = t1.ZBOOK AND ( t1.ZFIRST == 'Sarah' AND t1.ZLAST == 'Monarch')) ) > 0
(The precise query will depend on whether the inverse relationship is defined and if so whether it is one-many or many-many; the above is based on the relationship having a to-one inverse, book
).
When SQLite is handed that query to execute, it will check what indexes it has available and then invoke the query planner to determine how best to process it. The bit of interest is the subselect to get the count:
SELECT COUNT(t1.Z_PK) FROM ZAUTHOR t1 WHERE (t0.Z_PK = t1.ZBOOK AND ( t1.ZFIRST == 'Sarah' AND t1.ZLAST == 'Monarch'))
This is the bit of the query which corresponds to your first code snippet. Note that it is in SQLite terms a correlated subquery: it includes a parameter (t0.Z_PK - this is essentially the relevant Book) from the outer SELECT statement. SQLite will search the entire table of Authors, looking first to see whether they are related to that Book and then to see whether the author first and last names match. Your proposition is that this is inefficient; the nested select can stop as soon as any matching Author is found. In SQLite terms, that would correspond to a query like this:
SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZNAME, ... FROM ZBOOK t0 WHERE EXISTS(SELECT 1 FROM ZAUTHOR t1 WHERE (t0.Z_PK = t1.ZBOOK AND ( t1.ZFIRST == 'Sarah' AND t1.ZLAST == 'Monarch')) )
(It's unclear from the SQLite docs whether the EXISTS operator actually shortcuts the underlying subselect, but this answer elsewhere on SO suggests it does. If not it might be necessary to add "LIMIT 1" to the subselect to get it to stop after one row is returned). The problem is, I don't know of any way to craft a CoreData predicate which would be converted into a SQLite query using the EXISTS operator. Certainly it's not listed as a function in the NSExpression
documentation, nor is it mentioned (or listed as a reserved word) in the predicate format documentation. Likewise, I don't know of any way to add "LIMIT 1" to the subquery (though it's relatively straightforward to add to the main fetch request using fetchLimit.
So not much scope for addressing the issue you identify. However, there might be other inefficiencies. Scanning the Author table for each Book in turn (the correlated subquery) might be one. Might it be more efficient to scan the Author table once, identify those that meet the relevant criteria (first
= "Sarah" and last
= "Monarch"), then use that (presumably much shorter) list to search for the books? As I said at the start, that's an open question: I have no idea whether it is or isn't more efficient.
To pass results of one fetch request to another, use NSFetchRequestExpression
. It's a bit arcane but hopefully the following code is clear enough:
let authorFetch = Author.fetchRequest()
authorFetch.predicate = NSPredicate(format: "#first == 'Sarah' AND #last == 'Monarch'")
authorFetch.resultType = .managedObjectIDResultType
let contextExp = NSExpression(forConstantValue: self.managedObjectContext)
let fetchExp = NSExpression(forConstantValue: authorFetch)
let fre = NSFetchRequestExpression.expression(forFetch: fetchExp, context: contextExp, countOnly: false)
let bookFetch = Book.fetchRequest()
bookFetch.predicate = NSPredicate(format: "ANY authors IN %@", fre)
let results = try! self.managedObjectContext!.fetch(bookFetch)
The result of using that fetch is a SQL query like this:
SELECT DISTINCT 0, t0.Z_PK, t0.Z_OPT, t0.ZNAME, ... FROM ZBOOK t0 JOIN ZAUTHOR t1 ON t0.Z_PK = t1.ZBOOK WHERE t1.Z_PK IN (SELECT n1_t0.Z_PK FROM ZAUHTOR n1_t0 WHERE ( n1_t0.ZFIST == 'Sarah' AND n1_t0.ZLAST == 'Monarch')
This has its own complexities (a DISTINCT, a JOIN, and a subselect) but importantly the subselect is no longer correlated: it is independent of the outer SELECT so can be evaluated once rather than being re-evaluated for each row of the outer SELECT.