0

After I read this post, I found a third way to achieve the query with NOT ANY. Does it really makes a difference?

Which of these queries run the faster?

NOT EXISTS

SELECT DISTINCT U.no FROM U
INNER JOIN PUF ON PUF.noU = U.no
WHERE NOT EXISTS (
    SELECT DISTINCT PUF.noU FROM F
    INNER JOIN PUF AS PUF2 ON PUF2.noF = F.no
    WHERE VILLE <> 'Lausanne' AND PUF.noU = PUF2.noU
)

NOT attribute = ANY

SELECT DISTINCT U.no FROM U
INNER JOIN PUF ON PUF.noU = U.no
WHERE NOT PUF.noU = ANY (
    SELECT DISTINCT PUF.noU FROM F
    INNER JOIN PUF ON PUF.noF = F.no
    WHERE VILLE <> 'Lausanne'
)

NOT IN (The initial one)

SELECT DISTINCT U.no FROM U
INNER JOIN PUF ON PUF.noU = U.no
WHERE PUF.noU NOT IN (
    SELECT DISTINCT PUF.noU FROM F
    INNER JOIN PUF ON PUF.noF = F.no
    WHERE VILLE <> 'Lausanne'
)

If they're equivalent, should I use NOT EXISTS instead of NOT attribute = ANY () ?

Olivier D'Ancona
  • 779
  • 2
  • 14
  • 30

1 Answers1

2

The one construct to avoid in most cases is NOT IN, especially NOT IN (subquery).
NOT attribute = ANY (subquery) is equally inferior.

Those yield "surprising" results when involving NULL values, and there is almost always a superior (correct, faster, less deceiving) formulation with NOT EXISTS or LEFT JOIN / IS NULL. See:

Either way, DISTINCT is useless noise inside an EXISTS expression. Postgres stops evaluation after the first find anyway.
You can probably simplify to this:

SELECT DISTINCT U.no
FROM   U
JOIN   PUF ON PUF.noU = U.no
WHERE  NOT EXISTS (
   SELECT FROM F
   WHERE  F.no = PUF.noF
   AND    F.VILLE <> 'Lausanne'
   );

Maybe you can simplify further. The JOIN may not be needed, either. I would have to know exact table definitions and the exact intention of the query to be sure.

That said, the proof of the pudding is in the eating. After verifying that all variants are indeed equivalent (NULL values?), test performance for each (properly) and compare. See:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228