0

Schema and sample data

A simple one-to-many parent/child relationship:

CREATE TABLE Parent (
  id VARCHAR PRIMARY KEY
);

CREATE TABLE Child (
  id VARCHAR PRIMARY KEY,
  type INT,
  parent_id VARCHAR REFERENCES Parent(id)
);

INSERT INTO Parent VALUES ('p1'), ('p2'), ('p3'), ('p4');
INSERT INTO Child VALUES ('c1', 0, 'p2'), ('c2', 42, 'p2'), ('c3', 0, 'p3'), ('c4', 42, 'p4');

SQL Fiddle

Queries

We always query for parents with their children. Whenever a particular parent is part of the result, all its children should be retrieved too, in any case.

Use case A)

No restriction on the children - just display all parent/child data, including the parents with no children (p1).

The LEFT JOIN query

SELECT p.*, c.* FROM parent p
LEFT JOIN child c ON c.parent_id = p.id

yields the desired output:

id|id|type|
--|--|----|
p1|  |    |
p2|c1|   0|
p2|c2|  42|
p3|c3|   0|
p4|c4|  42|

Use case B)

There's a restriction on the children, e.g. type = 42. We'd like to retrieve the parents (with all their children) that have any child with type = 42, i.e., the desired output is:

id|id|type|
--|--|----|
p2|c1|   0|
p2|c2|  42|
p4|c4|  42|

Note that c1 is retrieved because its parent p2 has a matching child c2, even though c1's type doesnt' match.

This query with a correlated subquery achieves this result:

SELECT p.*, c.* FROM parent p
JOIN child c ON c.parent_id = p.id
WHERE EXISTS (SELECT 1 FROM child c2 WHERE c2.parent_id = p.id AND c2.type = 42)

Questions

1) Is the second query WHERE EXISTS the standard way to go for use case B)?

2) Since, in real life, there are multiple associations and multiple possible filter conditions, is there a way the entire construction of the query can be simplified, covering both use cases A) and B)? In the sense of a "master query" that can be "enriched" by conditions for various child associations. In case of a single parent table, there is a "master query" of

SELEECT * from Parent p
WHERE 1 = 1

Any conditions on the parent can simply be AND'ed to the WHERE clause of the "master query", making this query construction very easy.

Is there anything comparable for joins with various conditions on various children, retaining the "always return all children for each returned parent" semantics?

shimikano
  • 39
  • 1
  • 4
  • Please in code questions give a [mre]--cut & paste & runnable code; example input (as initialization code) with desired & actual output (including verbatim error messages); tags & versions; clear specification & explanation. For errors that includes the least code you can give that is code that you show is OK extended by code that you show is not OK. (Debugging fundamental.) For SQL that includes DBMS & DDL, which includes constraints & indexes & tabular initialization. See [ask], other [help] links & the voting arrow mouseover texts. – philipxy Jan 25 '20 at 11:53
  • If you want to display all parent data, why are you joining to the child table? – Nick Jan 25 '20 at 11:54
  • Your description of what you want to do is not clear. Use enough words, sentences & references to parts of examples to clearly & fully say what you mean. PS Learn what LEFT JOIN ON returns: INNER JOIN ON rows UNION ALL unmatched left table rows extended by NULLs. Always know what INNER JOIN you want as part of an OUTER JOIN. After an OUTER JOIN ON, a WHERE or INNER JOIN that requires a right [sic] table column to be not NULL removes any rows extended by NULLs, ie leaves only INNER JOIN ON rows, ie "turns OUTER JOIN into INNER JOIN". You have that. – philipxy Jan 25 '20 at 11:58
  • "check if there are filter criteria for the child and if so, I add the WHERE clause" The where filters all the rows of the join, so it filters parent-child row pairs, it doesn't just filter the children. If you want to restrict which children get left joined to parents then you need to apply the child filter to the child table before or while you left join. If you only want parents with children then you can inner join & put the restriction in the on or the where. But you're not clear re what output is desired for what input or what this filtering you talk about is or what your deliverable is. – philipxy Jan 25 '20 at 12:38
  • @philipxy Thank you for your comments. In case you have time to look at it again: I edited the question, hoping to clarify things. – shimikano Jan 25 '20 at 16:48
  • The edit improves. However: Please don't insert EDITs/UPDATEs, just make your post the best presentation as of right now. Adding to something unclear doesn't make it clear. A "basically" or "essentially" or "in other words" that doesn't introduce or summarize a clear, precise & full description that you also give just means "unclearly". Putting words in scare quotes does not make clear the idiosyncratic specific meaning that you didn't write out. Please include everything needed to answer as text in the post itself, not just at a link. Make your post self-contained. (See all of past comments.) – philipxy Jan 26 '20 at 00:50

1 Answers1

0

I think you can modify your second query to move the exists condition to the on clause:

SELECT p.*, c.*
FROM parent p LEFT JOIN
     child c
     ON c.parent_id = p.id AND
        EXISTS (SELECT 1
                FROM child c2
                WHERE c2.parent_id = p.id AND
                      c2.type = 42
               );

That said, this is simpler using window functions:

SELECT p.*, c.*
FROM parent p LEFT JOIN
     (SELECT c.*,
             SUM(CASE WHEN c.type = 42 THEN 1 ELSE 0 END) OVER (PARTITION BY c.parent_id) as num_42
      FROM child c
     ) c
     ON c.parent_id = p.id AND
        num_42 > 0;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Thank you for your reply. I agree that for an OUTER JOIN, it matters whether the EXISTS condition is part of the ON or the WHERE clause... In case of an INNER JOIN however, this should be equivalent, both in terms of result and pervormance, cf. https://stackoverflow.com/questions/1018952. The idea with the window function is interesting - thank you. – shimikano Feb 02 '20 at 00:29