0

If I have 2 tables A and B, and do the following query :

SELECT * FROM A FULL OUTER JOIN B ON A.pk = B.fk;

I understand that the number of results will be the sum of 3 types of results :

  1. The number of matching combinations (a,b)
  2. The number of A rows for which there is no matching B : (a, null)
  3. The number of B rows for which there is no matching A : (null, b)

But if if I add a condition on A, for example :

SELECT * FROM A FULL OUTER JOIN B ON A.pk = B.fk WHERE A.field = value;

Compared to the previous query, what will happen to the first group of results ?

For some rows the condition on A is not satisfied. Will those results of type "1" become results of type "3" in the form (null,b), of will these be completely discarded from the results ?

Nicomak
  • 2,319
  • 1
  • 21
  • 23
  • #3 gets dropped, basically it becomes a left outer join instead of a full outer join – Rabbit Dec 04 '15 at 03:52
  • Really ? Why would it drop the right side for values without a matching left side ? I did not specify any condition on the B table, so I think I deserve to see all of them :) – Nicomak Dec 04 '15 at 04:00
  • Would it change something if I added " ... AND B.field = value;" at the end of the query ? – Nicomak Dec 04 '15 at 04:01
  • 1
    It drops #3 because in #3, all the values in A are null. But your where clause is saying A has to equal a value, so it can't be null, therefore it filters out all the null A records. – Rabbit Dec 04 '15 at 09:03
  • If you add that additional filter to the where clause, then you drop #2 and #3. Making the join effectively an inner join. – Rabbit Dec 04 '15 at 09:04
  • This question is technology specific. – Stoleg Dec 04 '15 at 12:16
  • But the FULL OUTER JOIN operation is specified in all SQL standards isn't it ? – Nicomak Dec 04 '15 at 12:27

2 Answers2

0

When you write a query, the different operation sections are written in the order:

SELECT->FROM->WHERE->GROUP BY->HAVING->ORDER BY->LIMIT

But the query is physical evaluated in the order:

FROM->WHERE->GROUP BY->SELECT->HAVING->ORDER BY->LIMIT

So in the case of your FULL OUTER JOIN, when you specify a WHERE clause it is evaluating that clause on the result of the joined tables (usually a temp table is created for this). If you want the behavior to be that the only records from A that are returned match A.field = value, you will need to select those records from A first, then join that result to the B table:

SELECT * 
    FROM (SELECT A.pk
          FROM A
          WHERE A.field = value) A
        FULL OUTER JOIN B 
            ON A.pk = B.fk;

Here is an SqlFiddle so you can see.

Jacob Lambert
  • 7,449
  • 8
  • 27
  • 47
  • So to sum up : if I specify my condition in the ON clause I simply "filter out" data from A before joining and keep all data from B. But if I specify my condition in the WHERE clause, it will waste its time joining the whole tables, and then discard all (a,b) joined values if "a" does not satisfy the result, but also discard all (null, b) joined values because there is no left value to evaluate. – Nicomak Dec 04 '15 at 04:26
  • So when should we use the WHERE clause instead of ON clause ? Seems like you apply any condition with the ON clause while saving performance, no ? – Nicomak Dec 04 '15 at 04:40
  • Ok, so the query you gave as an example in your answer will actually show all values of the table A, but for rows where A.field != value, you will have (a,null) instead of (a,b). It won't only include rows of A which satisfy A.field = value, as you said. According to the accepted answer [here](http://stackoverflow.com/questions/354070/sql-join-where-clause-vs-on-clause). – Nicomak Dec 04 '15 at 05:01
  • 1
    @Nicomak, you are correct, that was my bad. I updated the answer and provided a SqlFiddle for you. – Jacob Lambert Dec 04 '15 at 05:15
  • Ok thanks. The SqlFiddle is cool it helped me test a few things. Thanks for your hard work, I've accepted your answer. I also posted my own answer but it's just for a clear recap of everything. – Nicomak Dec 04 '15 at 12:16
0

RECAP - Conditions used in a Full Outer Join

I've created the simplest scenario with 2 tables containing matching records and non-matching from both sides, and tested all cases, in this sqlfiddle : http://sqlfiddle.com/#!3/901cd2/11

The fiddle was based on the answer of @JRLambert, of whom I've accepted the answer. This answer is just for summarizing and safekeeping.

Here are the SQL queries and explanations of this fiddle (in case it disappears one day) :

-- Basic full join
-- Returns (a,b) for matching records, and also (a, null) and (null, b)
-- for all records from A and B without matches
SELECT * FROM A 
FULL OUTER JOIN B 
ON A.pk = B.fk;

-- With condition in the ON clause
-- Only joins results (a,b) if conditions are satisfied on A. 
-- The example returns :
-- 1. (a,b) only if A and B records are matched, and a.field = 0
-- 2. (a, null) for unmatched A records, and also all values of A with A.field != 0 (even if there is a matching B record)
-- 3. (null, b) for all unmatched B records, or with matching A records not satisfying the condition
SELECT * FROM A 
FULL OUTER JOIN B 
ON A.pk = B.fk AND A.field = 0;

-- With condition in the WHERE clause
-- Joins all matching record first, and return only pairs (a,b) and (a, null) if a satisfied the condition.
-- This example joins as the first "Basic full join", and then only returns rows with a satisfying the condition (meaning cannot be null)
SELECT * FROM A
FULL OUTER JOIN B
ON A.pk = B.fk
WHERE A.field = 0;

-- To select all join results satisfying the condition on A,
-- but keeping all B results as (null,b) whenever the condition is not true on A,
-- preselect keys before the join
SELECT * 
FROM (SELECT A.pk FROM A WHERE A.field = 0) A
FULL OUTER JOIN B
ON A.pk = B.fk;
Nicomak
  • 2,319
  • 1
  • 21
  • 23