2

I am receiving different results from the following two queries and do not understand why. Shouldn't this NOT IN query return the same result as the NOT EXISTS query?

SELECT p.pid, p.pname, w.salary
FROM Person p, worksFor w
WHERE p.pid = w.pid AND NOT EXISTS (SELECT *
                    FROM worksFor w1,Knows k
                    WHERE w1.pid = k.pid1 AND w1.cname = w.cname AND w1.pid <> w.pid AND k.pid2 = w.pid);

SELECT p.pid, p.pname, w.salary
FROM Person p, worksFor w
WHERE p.pid = w.pid AND 1 NOT IN (SELECT 1
                    FROM worksFor w1, Knows k
                    WHERE w1.pid = k.pid1 AND w1.cname = w.cname AND w1.pid <> w.pid AND k.pid2 = w.pid);
 pid  |  pname   | salary 
------+----------+--------
 1002 | Vidya    |  45000
 1007 | Danielle |  50000
 1010 | Linda    |  55000
 1011 | Nick     |  70000
 1013 | Lisa     |  55000
 1016 | YinYue   |  55000
 1018 | Arif     |  50000
 1019 | John     |  50000

 pid  |  pname   | salary 
------+----------+--------
 1007 | Danielle |  50000
 1012 | Eric     |  50000
 1017 | Latha    |  60000
 1018 | Arif     |  50000
 1019 | John     |  50000
(5 rows)

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • 2
    @a_horse_with_no_name, that is not true: `select 1 NOT IN (SELECT 1 from cell_per where line_id > 200); t` – Adrian Klaver Mar 16 '22 at 19:55
  • 2
    Pretty sure this comes down to difference in handling `NULL` or the comparison differences as outlined here [Subquery Expressions](https://www.postgresql.org/docs/current/functions-subquery.html#FUNCTIONS-SUBQUERY-NOTIN). It would help to see the output of the subquery when executed as a stand alone query. – Adrian Klaver Mar 16 '22 at 20:00
  • 1
    @a_horse_with_no_name: For the record: `SELECT 1 NOT IN (SELECT 1 WHERE FALSE);` --> `true`. You forgot the case where the subquery returns *no row*. – Erwin Brandstetter Apr 27 '22 at 18:10

1 Answers1

4

The two given queries should produce the same result since you have SELECT 1 with NOT IN (which makes little sense for IN / NOT IN to begin with).

NOT EXISTS and NOT IN differ when null values are involved in the comparison:

SELECT NOT EXISTS (SELECT NULL::int);  -- false
SELECT 1 NOT IN   (SELECT NULL::int);  -- null !    

NOT EXISTS is false if the bracket expression returns any rows.
NOT IN differs when NULL values are involved. The logic behind it: null values are considered "unknown", their true value might match after all, we just don't know. So the result is null instead of false.

NOT IN is discouraged for this use case.
NOT EXISTS is almost always the better tool with subqueries. Typically faster, too.

See:

My guess is that your displayed NOT IN query differs from what you actually tested. A typical "equivalent" NOT IN formulation for the given NOT EXISTS clause would be:

...
AND (w.cname, w.pid) NOT IN (
   SELECT w1.cname, k.pid2
   FROM   worksFor w1, Knows k
   WHERE  w1.pid = k.pid1
   AND    w1.pid <> w.pid);

"Equivalent" in quotes, because it differs with null values.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228