4

Here's an example of a query plan with Materialize:

Nested Loop  (cost=4.65..49.46 rows=33 width=488)
   Join Filter: (t1.hundred < t2.hundred)
   ->  ... (outer)
   ->  Materialize  (cost=0.29..8.51 rows=10 width=244)
         ->  ... (inner)

And here's an example of a query plan with Hash:

Hash Join  (cost=230.47..713.98 rows=101 width=488)
   Hash Cond: (t2.unique2 = t1.unique2)
   ->  ... (outer)
   ->  Hash  (cost=229.20..229.20 rows=101 width=244)
         ->  ... (inner)

In the second example the results of the inner query are loaded into a hash table and for each of the outer rows the Hash Join will do a hash table lookup.

In the first example the results of the inner are also loaded into memory (that's what Materialize means, isn't it?), and for each outer row the Nested Loop Join also has to do a lookup for the correct row in the materialized data. What kind of data structure does it use for that? (Clearly not a hash table.)

AndreKR
  • 32,613
  • 18
  • 106
  • 168
  • Possible duplicate of [SQL explain plan: what is Materialize?](http://stackoverflow.com/questions/3029841/sql-explain-plan-what-is-materialize) – freakish Aug 09 '16 at 08:58
  • Check the sources: `postgres-9.5/src/backend/executor/nodeMaterial.c` – joop Aug 09 '16 at 09:03
  • It appears the materialised nodes use the ordinary tuple store, while the hashjoin needs to allocate+manage additional memory to store the complete hash machinery+state. (Just compare the file sizes of the sources ...) – joop Aug 09 '16 at 09:27
  • @joop Indeed, looking at the source answered my question. Thanks. – AndreKR Aug 09 '16 at 09:34

1 Answers1

3

After consulting the source I see that Materialize is basically just a contiguous cache of rows (a tuplestore) that is constantly rewound and iterated again for each of the outer rows.

AndreKR
  • 32,613
  • 18
  • 106
  • 168