Why are double nested NOT EXISTS statements unavoidable in SQL

Question

This is more out of curiosity/scientific interest than based on a real problem.

A programming language is supposed to be a tool and tools are made to make working easier. So why is it that you can find all entries in a one table by simply doing SELECT * FROM foo WHERE bar=42; but as soon as there are multiple tables involved, there is no easy/intuitive way to say "find all tuples meeting this criteria"?

The example is about components which are shipped to different projects in different cities. Primary keys in bold.

component(**CNR**,Cname,Color,Weight,City)
project(**PNR**,Pname,City)
scp(**SNR**,**CNR**,**PNR**,Quantity)

The task is to write a query to find all components that are/were shipped to all projects in one specific city.

The given solution:

SELECT CNR
FROM components
WHERE NOT EXISTS (
    SELECT 1
    FROM project
    WHERE project.city = 'Foobar Town'
    AND NOT EXISTS (
        SELECT 1
        FROM scp
        WHERE scp.PNR = projekt.PNR
        AND scp.CNR = components.CNR ));

It is correct, makes sense and even works. But it is not intuitive--a straightforward statement of "get me this". Why?

The example and all the material in our course is based on SQL92.

This is one possible solution. If you want to use a more intuitive solution, use one - but if you want us to find one then you'll have to define 'intuitive' — Strawberry, Dec 31 '14 at 14:26
Seems to me the reason is that your solution needs to prove a negative. SQL is set up to be easy to get positives: "Get all results that meet this condition". But you need to "Get all results where there are no exceptions to this condition". If you can think of a way to re-invent SQL so it's set up to do both intuitively, you could be the next billionaire, but I don't think it will be as easy as you think it should be. — Tab Alleman, Dec 31 '14 at 14:37
The "given solution" doesn't do what you think it does. Build some tables and test it. — Mike Sherrill 'Cat Recall', Dec 31 '14 at 14:43
This is just my opinion, but I feel as if this is a query that could be rewritten using a `JOIN`, which may appear more intuitive. It also depends on your definition of 'simpler'. Sometimes, the least amount of code isn't always the most readable. — AdamMc331, Dec 31 '14 at 14:49
If I'm not mistaken the query does what is known as relational division, and why that particular relational operation never has been implemented is a mystery to me; we did get union, intersection and difference, but never division. Maybe it's too hard to implement. It would have been nice to do `set1 divided by set2` though. — jpw, Dec 31 '14 at 14:52
I see little resemblance between the stated task and the given solution. — Dan Bracuk, Dec 31 '14 at 16:30
A better example for relational division would be: find a pair(set) of persons who rented exactly the same **set** of {cars, movies, hotelrooms} — wildplasser, Dec 31 '14 at 16:44
Does this answer your question? [Is there any rule of thumb to construct SQL query from a human-readable description?](https://stackoverflow.com/questions/33947260/is-there-any-rule-of-thumb-to-construct-sql-query-from-a-human-readable-descript) — philipxy, Jan 14 '23 at 23:04
@wildplasser Relational division asks for rows associated with at least the set & with at least 1 element of the set. There are so many queries like but not quite division, with the particulars of division easily misunderstood, that instead of expressing in division it is likely better to just always use exists and/or set comparisons--appropriately commented. (Or in simple cases SQL shorthands like group/having/count.) — philipxy, Jan 14 '23 at 23:18

score 3 · Accepted Answer · answered Dec 31 '14 at 15:35

Your question is: "Find all components that are/were shipped to all projects in one specific city." You are rephrasing this as "Find all components where there is no project in a given city that doesn't have the component."

I'd be more inclined to answer this directly:

select scp.component
from scp join
     projects p
     on scp.pnr = p.pnr
where p.city = 'Foobar Town'
group scp.component
having count(distinct scp.pnr) = (select count(distinct pnr)
                                  from projects
                                  where city = 'Foobar Town'
                                 );

This counts the number of distinct projects in the city and compares them to the number of projects in the city (the distinct id probably not necessary in the subquery.

First, I'm not sure if this is simpler. Second, I'm the first to admit that the NOT EXISTS method may be more efficient, although the nesting of NOT EXISTS in subqueries may be detrimental to performance. I do, however, think that this is easier to follow.

Why are double nested NOT EXISTS statements unavoidable in SQL

1 Answers1

Linked