1. Will this query execute the SELECT Id from PhraseSource
for every row?
No.
In SQL you express what you want to do, not how you want it to be done1. The engine will create an execution plan to do what you want in the most performant way it can.
For your query, executing the query for each row is not necessary. Instead the engine will create an execution plan that executes the subquery once, then does a left anti-semi join to determine what IDs are not present in the PhraseSource
table.
You can verify this when you include the Actual Execution Plan in SQL Server Management Studio.
2. Is there a more efficient way that this could be coded?
A little bit more efficient, as follows:
DELETE
p
FROM
Phrase AS p
WHERE
NOT EXISTS (
SELECT
1
FROM
PhraseSource AS ps
WHERE
ps.Id=p.PhraseId
);
This has been shown in tests done by user Aaron Bertrand on sqlperformance.com: Should I use NOT IN, OUTER APPLY, LEFT OUTER JOIN, EXCEPT, or NOT EXISTS?:
Conclusion
[...] for the pattern of finding all rows in table A where some condition does not exist in table B, NOT EXISTS is typically going to be your best choice.
Another benefit of using NOT EXISTS
with a correlated subquery is that it does not have problems when PhraseSource.Id
can be NULL
. I suggest you read up on IN/NOT IN
vs NULL
values in the subquery. E.g. you can read more about that on sqlbadpractices.com: Using NOT IN operator with null values.
The PhraseSource.Id
column is probably not nullable in your schema, but I prefer using a method that is resilient in all possible schemas.
1. Exceptions exist when forcing the engine to use a specific path, e.g. with Table Hints or Query Hints. The engine doesn't always get things right.