The devil is in the details with this seemingly simple task.
Short and among the fastest:
SELECT col1, col2
FROM (SELECT col1, col2, TRUE AS x1 FROM t1) t1
FULL JOIN (SELECT col1, col2, TRUE AS x2 FROM t2) t2 USING (col1, col2)
WHERE (x1 AND x2) IS NULL;
The FULL [OUTER] JOIN
includes all rows from both sides, but fills in NULL values for columns of missing rows. The WHERE
conditions (x1 AND x2) IS NULL
identifies these unmatched rows. Equivalent: WHERE x1 IS NULL OR x2 IS NULL
.
To eliminate duplicate pairs, add DISTINCT
(or GROUP BY
) at the end - cheaper for few dupes:
SELECT DISTINCT col1, col2
FROM ...
If you have many dupes on either side, it's cheaper to fold before the join:
SELECT col1, col2
FROM (SELECT DISTINCT col1, col2, TRUE AS x1 FROM t1) t1
FULL JOIN (SELECT DISTINCT col1, col2, TRUE AS x2 FROM t2) t2 USING (col1, col2)
WHERE (x1 AND x2) IS NULL;
It's more complicated if there can be NULL values. DISTINCT
/ DISTINCT ON
or GROUP BY
treat them as equal (so dupes with NULL values are folded in the subqueries above). But JOIN
or WHERE
conditions must evaluate to TRUE
for rows to pass. NULL values are not considered equal in this, the FULL [OUTER] JOIN
never finds a match for pairs containing NULL. This may or may not be desirable. You just have to be aware of the difference and define your requirements.
Consider the added demo in the SQL Fiddle
If there are no NULL values, no duplicates, but an additional column defined NOT NULL
in each table, like the primary key, let's name each id
, then it can be as simple as:
SELECT col1, col2
FROM t1
FULL JOIN t2 USING (col1, col2)
WHERE t1.id IS NULL OR t2.id IS NULL;
Related: