Nested Join vs Merge Join vs Hash Join in PostgreSQL

Question

I know how the

Nested Join
Merge Join
Hash Join

works and its functionality.

I wanted to know in which situation these joins are used in Postgres

Please see my detailed answer here: https://stackoverflow.com/a/10657906/905902 (note: that was for Pg8.4, but the *cost based* query planner has not changed that much) — wildplasser, Feb 28 '18 at 13:45

Laurenz Albe · Answer 1 · 2019-10-09T08:10:00.323

The following are a few rules of thumb:

Nested loop joins are preferred if one of the sides of the join has few rows. Nested loop joins are also used as the only option if the join condition does not use the equality operator.
Hash Joins are preferred if the join condition uses an equality operator and both sides of the join are large and the hash fits into work_mem.
Merge Joins are preferred if the join condition uses an equality operator and both sides of the join are large, but can be sorted on the join condition efficiently (for example, if there is an index on the expressions used in the join column).

A typical OLTP query that chooses only one row from one table and the associated rows from another table will always use a nested loop join as the only efficient method.

Queries that join tables with many rows (which cannot be filtered out before the join) would be very inefficient with a nested loop join and will always use a hash or merge join if the join condition allows it.

The optimizer considers each of these join strategies and uses the one that promises the lowest costs. The most important factor on which this decision is based is the estimated row count from both sides of the join. Consequently, wrong optimizer choices are usually caused by misestimates in the row counts.

**cannot be sorted on the join criteria efficiently.** - Can you explain still little bit brief about that line. Because I confuse with Merge and Hash — vinieth, Feb 28 '18 at 12:19
IMO hash joins are preferred if the resulting hash table is *assumed* to fit (more or less) into work_mem, and the other methods are *assumed* to cost more.This often causes small or new tables to be hash-joined. — wildplasser, Feb 28 '18 at 12:20
@wildplasser I agree and have changed the answer accordingly. My answer is not a complete explanation, the question is very general. — Laurenz Albe, Feb 28 '18 at 17:14

Nested Join vs Merge Join vs Hash Join in PostgreSQL

1 Answers1