The relational operator you require is semi difference a.k.a. antijoin.
Most SQL products lacks an explicit semi difference operator or keyword. Standard SQL-92 doesn't have one (it has a MATCH (subquery)
semijoin predicate but, although tempting to think otherwise, the semantics for NOT MATCH (subquery)
are not the same as for semi difference; FWIW the truly relational language Tutorial D successfully uses the NOT MATCHING
semi difference).
Semi difference can of course be written using other SQL predicates. The most commonly seen are: outer join with a test for nulls in the WHERE
clause, closely followed by EXISTS
or IN (subquery)
. Using EXCEPT
(equivalent to MINUS
in Oracle) is another possible approach if your SQL product supports it and again depending on the data (specifically, when the headings of the two tables are the same).
Personally, I prefer to use EXISTS
in SQL for semi difference join because the join clauses are closer together in the written code and doesn't result in projection over the joined table e.g.
SELECT *
FROM TABLE1 W
WHERE NOT EXISTS (
SELECT *
FROM TABLE2 V
WHERE W.NAME = V.NAME
);
As with NOT IN (subquery)
(same for the outer join approach), you need to take extra care if the WHERE
clause within the subquery involves nulls (hint: if WHERE
clause in the subquery evaluates UNKNOWN due to the presence of nulls then it will be coerced to be FALSE by EXISTS
, which may yield unexpected results).
UPDATE (3 years on): I've since flipped to preferring NOT IN (subquery)
because it is more readable and if you are worried about unexpected results with nulls (and you should be) then stop using them entirely, I did many more years ago.
One way in which it is more readable is there is no requirement for the range variables W
and V
e.g.
SELECT * FROM TABLE1 WHERE name NOT IN ( SELECT name FROM TABLE2 );