35

My tables are set up something like this:

table name: process
fields: name, id_string

table name: value_seach
fields: id_string, value

I want to construct a select statement that will display all of the process names (with it's respective id_string) that do not have an entry in value_search.

The id_string in the process table can be null, and still have a name, but those need to be excluded if possible. The id_string in value_search can never be null

How do I do this?

onedaywhen
  • 55,269
  • 12
  • 100
  • 138
user906153
  • 1,218
  • 8
  • 30
  • 43

4 Answers4

58

In general if you want rows that don't exist in another table, then LEFT JOIN the other table and WHERE ... IS NULL to a column on the second table. Also you mentioned that you don't want rows where process.id_string is NULL.

SELECT p.name, p.id_string
FROM
    process p
    LEFT JOIN value_search v
        ON v.id_string = p.id_string
WHERE
    v.id_string IS NULL
    AND p.id_string IS NOT NULL

This is known as an anti-join.

zgpmax
  • 2,777
  • 15
  • 22
  • This was very helpful. Thank you. – 1man Feb 14 '18 at 19:37
  • Note that if you need to filter the column that you want to find a lack of a join (i.e. v in your example), you'll need to do that in the JOIN clause and not the WHERE. This might be obvious to people with more SQL experience but it was not the first thing I tried, so I thought it might help someone down the line. – Ibrahim Nov 05 '18 at 21:49
29

I believe using Not Exists would be your best option here.

SELECT p.name, p.id_string
FROM process p
WHERE 
   NOT p.id_string IS NULL AND
   NOT EXISTS(
          SELECT NULL
          FROM value_search v
          WHERE p.id_string = v.id_string)
Magnus
  • 45,362
  • 8
  • 80
  • 118
  • 4
    @BrianDriscoll No it wont, `Left outer join` with `null` check and `Not Exists` will create the same query plan even if the left table has null values or not. See: http://explainextended.com/2009/09/17/not-in-vs-not-exists-vs-left-join-is-null-oracle/ for more info. I the values are guaranteed to be `not null`, `Left join`, `Not exists` and `Not in` will have create the same query plan. – Magnus Feb 20 '12 at 18:14
  • 4
    @BrianDriscoll Also note that for many other DBMS, such as SQL server, a `Left Outer Join` with `NULL` check would yield the worst performance since it cannot discern an ANTI JOIN. – Magnus Feb 20 '12 at 18:39
  • A query that also works if you want to check a secondary ID on the where clause in the not exists. – mashtheweb Dec 12 '12 at 10:14
  • 2
    This is a result of testing on a real data. (PostgreSQL 9.4): 1) "left join": explain: "Hash Right Join on t1, Filter t2.id is null, Seq Scan on t2, Hash t1, Seq Scan t1"; Real execution time: 11.9 seconds. 2) "where not exists": explain: "Hash Anti Join, Seq Scan on t1, Hash t2, Seq Scan on t2". Real execution time: 11.9 seconds. Conclusion: Real execution time is the same but query plan for second approach looks better. – bartolo-otrit Aug 31 '15 at 08:42
5

The query you want should look something like this. Note that a JOIN will be significantly faster than a subquery in the WHERE clause.

SELECT p.name, p.id_string
FROM process p
LEFT OUTER JOIN value_search v
   ON p.id_string = v.id_string
   AND p.id_string IS NOT NULL
   AND v.id_string IS NULL

An equally valid variant of the query above would be:

SELECT p.name, p.id_string
FROM process p
LEFT OUTER JOIN value_search v
   ON p.id_string = v.id_string
WHERE
   p.id_string IS NOT NULL
   AND v.id_string IS NULL
Brian Driscoll
  • 19,373
  • 3
  • 46
  • 65
2
SELECT 
  name,
  id_string
FROM process
WHERE id_string IS NOT NULL AND id_string NOT IN SELECT id_string FROM value_seach
alexsuslin
  • 4,130
  • 1
  • 20
  • 30