8

Many posts like this stackoverflow link claim that there is no concept of a clustered index in PostgreSQL. However, the PostgreSQL documentation contains something similar. A few people claim it is similar to a clustered index in SQL Server.

Do you know what the exact difference between these two is, if there is any?

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
Mahesh V S
  • 552
  • 1
  • 8
  • 23

1 Answers1

12

A clustered index or index organized table is a data structure where all the table data are organized in index order, typically by organizing the table in a B-tree structure.

Once a table is organized like this, the order is automatically maintained by all future data modifications.

PostgreSQL does not have such clustering indexes. What the CLUSTER command does is rewrite the table in the order of the index, but the table remains a fundamentally unordered heap of data, so future data modifications will not maintain that index order.

You have to CLUSTER a PostgreSQL table regularly if you want to maintain an approximate index order in the face of data modifications to the table.

Clustering in PostgreSQL can improve performance, because tuples found during an index scan will be close together in the heap table, which can turn random access to the heap to faster sequential access.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • In SQL Server, all columns of a table are part of clustered index; is that same here? If not, will clustering any how improve the performance? – Mahesh V S Dec 06 '17 at 10:04
  • 1
    No. `CLUSTER` in PostgreSQL will result in a new table and a new index that are just like they were before, except that the table has been reordered. I expanded my answer to explain the performance advantage. – Laurenz Albe Dec 06 '17 at 10:26
  • @LaurenzAlbe Are you saying two tables will be kept, one original and one with ordered columns? If not, then I don't see why you replied 'No' to the first comment. – LoMaPh Oct 18 '19 at 01:43
  • 1
    @LoMaPh A new copy of the table is written, and the old one is removed. The new table will ne a normal heap table just like the original one; it is **not** physically organized like an index. Right after the `CLUSTER` it is physically sorted in index order, but since PostgreSQL rows move around whenever they are modified, the table does not stay in this order. The correlation with the index "rots", and after a while you have to run `CLUSTER` again to get it back in shape. – Laurenz Albe Oct 18 '19 at 06:09