0

From PostgreSQL document, when explaining basics of EXPLAIN command:

When dealing with outer joins, you might see join plan nodes with both “Join Filter” and plain “Filter” conditions attached. Join Filter conditions come from the outer join's ON clause, so a row that fails the Join Filter condition could still get emitted as a null-extended row. But a plain Filter condition is applied after the outer-join rules and so acts to remove rows unconditionally. In an inner join there is no semantic difference between these types of filters.

"Join Filter conditions come from the outer join's ON clause". Then in outer join, where does a plain filter condition come from?

Could you give some examples?

Thanks.

Tim
  • 1
  • 141
  • 372
  • 590
  • I think Join Filter condition is what you give in "ON" clause and Plain Filter Condition is what you give in "WHERE" clause. Not sure about it. – Praveen E Jun 21 '18 at 13:35

2 Answers2

2

There is no other use of the term "plain Filter condition" used elsewhere in the Postgres documentation, so I would suspect that the author meant the word "plain" literally like not decorated or elaborate; simple or ordinary in character.

So really they are saying "When a filter is applied in an OUTER JOIN's ON clause the table or derived table being joined is just plainly filtered before the join occurs. This will lead to any columns from this table or derived table in the result set to be null".

JNevill
  • 46,980
  • 4
  • 38
  • 63
1

Here is a little example that might enlighten you:

CREATE TABLE a(a_id) AS VALUES (1), (3), (4);
CREATE TABLE b(b_id) AS VALUES (1), (2), (5);

Now we have to force a nested loop join:

SET enable_hashjoin = off;
SET enable_mergejoin = off;

Our query is:

SELECT *
FROM a
   LEFT JOIN b ON a_id = b_id
WHERE a_id > coalesce(b_id, 0);

 a_id | b_id 
------+------
    3 |     
    4 |     
(2 rows)

The plan is:

                QUERY PLAN
------------------------------------------
 Nested Loop Left Join
   Join Filter: (a.a_id = b.b_id)
   Filter: (a.a_id > COALESCE(b.b_id, 0))
   ->  Seq Scan on a
   ->  Materialize
         ->  Seq Scan on b

The “plain filter” is a condition that is applied after the join.

It is a frequent mistake to believe that conditions in the WHERE clause are the same as conditions in a JOIN … ON clause. That is only the case for inner joins. For outer joins, rows from the outer side that don't meet the condition are also included in the result.

That makes it necessary to have two different filters.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • Thanks. Why is there "Materialize" after performing "Seq Scan on b"? Is there a view generated by ` Seq Scan on b` and the view is then materalized? – Tim Jun 28 '18 at 17:39
  • (2) In the top three lines "Nested Loop Left Join", "Join Filter: (a.a_id = b.b_id)", and "Filter: (a.a_id > COALESCE(b.b_id, 0))", how did you know that the third line i.e. the plain filter selection happens after the first two lines i.e. the join, given that the second line i.e. the join filter and the third line i.e. the plain filter are indented in the same way under the first line i.e. the left join. – Tim Jun 28 '18 at 17:53
  • The "materialize" is just for performance (but that's off-topic). The plain filter is only logically after the join filter, in reality they can be executed at the same time. – Laurenz Albe Jun 28 '18 at 20:56
  • Thanks. What is "materialized? There is no view here. – Tim Jun 28 '18 at 21:23
  • It has nothing to do with materialized views. The result of an expensive subselect that is needed repeatedly gets cached to improve performance. – Laurenz Albe Jun 28 '18 at 22:13
  • I see. Is "Seq Scan on b" a subselect? I think it is a scan of all the records in b without any selection. – Tim Jun 29 '18 at 02:18
  • Come on, stop splitting hairs. That wording was from a source comment that I cared to look up for you and referred to the general use of Materialize nodes. The question is answered, I think; no need for a free course on optimizer concepts. – Laurenz Albe Jun 29 '18 at 03:45
  • Thanks. My apology. – Tim Jun 29 '18 at 12:36