53

I know how the

  1. Nested Join
  2. Merge Join
  3. Hash Join

works and its functionality.

I wanted to know in which situation these joins are used in Postgres

vinieth
  • 1,204
  • 3
  • 16
  • 34
  • 2
    https://use-the-index-luke.com/sql/join –  Feb 28 '18 at 07:11
  • Please see my detailed answer here: https://stackoverflow.com/a/10657906/905902 (note: that was for Pg8.4, but the *cost based* query planner has not changed that much) – wildplasser Feb 28 '18 at 13:45

1 Answers1

85

The following are a few rules of thumb:

  • Nested loop joins are preferred if one of the sides of the join has few rows. Nested loop joins are also used as the only option if the join condition does not use the equality operator.

  • Hash Joins are preferred if the join condition uses an equality operator and both sides of the join are large and the hash fits into work_mem.

  • Merge Joins are preferred if the join condition uses an equality operator and both sides of the join are large, but can be sorted on the join condition efficiently (for example, if there is an index on the expressions used in the join column).

A typical OLTP query that chooses only one row from one table and the associated rows from another table will always use a nested loop join as the only efficient method.

Queries that join tables with many rows (which cannot be filtered out before the join) would be very inefficient with a nested loop join and will always use a hash or merge join if the join condition allows it.

The optimizer considers each of these join strategies and uses the one that promises the lowest costs. The most important factor on which this decision is based is the estimated row count from both sides of the join. Consequently, wrong optimizer choices are usually caused by misestimates in the row counts.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • 1
    That is an index on the column used in the join condition. – Laurenz Albe Feb 28 '18 at 11:17
  • **cannot be sorted on the join criteria efficiently.** - Can you explain still little bit brief about that line. Because I confuse with Merge and Hash – vinieth Feb 28 '18 at 12:19
  • IMO hash joins are preferred if the resulting hash table is *assumed* to fit (more or less) into work_mem, and the other methods are *assumed* to cost more.This often causes small or new tables to be hash-joined. – wildplasser Feb 28 '18 at 12:20
  • 1
    @wildplasser I agree and have changed the answer accordingly. My answer is not a complete explanation, the question is very general. – Laurenz Albe Feb 28 '18 at 17:14