3

I know that SQL Server can store a row's data at leaf level in a clustered index. I believe that PostgreSQL doesn't do this. If so, what is its storage paradigm?

My main question is as follows. Consider the following design & data (shown in T-SQL):

CREATE TABLE dbo.Tree
    (
    [Key] int NOT NULL,
    ID int NOT NULL
    ) ON [PRIMARY]
GO
ALTER TABLE dbo.Tree ADD CONSTRAINT
    PK_Tree PRIMARY KEY CLUSTERED 
    (
    [Key],
    ID
    ) WITH (...) ON [PRIMARY]

INSERT INTO TREE ([Key], ID) VALUES (1, 1), (1, 2), (1, 3), (1, 4).

Since this is a btree with both columns as the PK, am I correct in saying that "[Key] = 1" would only be stored once, and "ID = [1, 2, 3, 4]" would be individual values in the btree, while there would be no leaf values per sé since there are no row columns that aren't part of the PK?

How would this work in PostgreSQL?

IamIC
  • 17,747
  • 20
  • 91
  • 154

4 Answers4

14

TL;DR version - your key values are always stored on disk, regardless of DBMS implementation.

PostgreSQL would store 4 rows in page on disk, one for each row that you've inserted. SQL Server will also store 4 rows on disk. The B-tree is the lookup structure, not the page level storage structure.

At the underlying disk level, PostgreSQL uses unordered disk structures to store data. This happens because PostgreSQL may be maintaining multiple copies of a row at any given time due to MVCC transaction semantics. Each row has an xmin and xmax detailing the creation and destruction transaction ID of the current row. The autovacuum process performs ghost record clean up operations. The indexes in PostgreSQL point back to the rows in the heap table structure. This set of slides details the process. In particular you'll want to look at slide 29 for how the b-tree lookup occurs and 48-52 for a theoretical discussion of how the data is stored on disk.

In SQL Server, you'll have records on a leaf page, but with only four rows the clustered index will have just 1 index level - the leaf level. You can verify this by running SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Tree'), NULL, NULL, NULL). You can also verify the physical page level in SQL Server doing something like this:

-- Locate the first page of the index
DBCC IND('test', 'Tree', 1);
GO
-- tell SQL Server to show DBCC output in the message page, not the SQL Server log
DBCC TRACEON (3604);
GO
-- look at nasty, dirty, on page data.
DBCC PAGE(test, 1,155,3);

Once you look at the DBCC PAGE output, you'll be ready to hate me. Towards the end you should see four rows that look something like this:

Slot 0 Offset 0x60 Length 15

Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP     Record Size = 15

Memory Dump @0x000000006D6FA060

0000000000000000:   10000c00 01000000 01000000 020000††††...............  

Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4

Key = 1                              

Slot 0 Column 2 Offset 0x8 Length 4 Length (physical) 4

ID = 1                               

Slot 0 Offset 0x0 Length 0 Length (physical) 0

KeyHashValue = (e2338e2f4a9f)  

This is the actual row data as SQL Server is storing it. You'll see multiple copies of Key = 1 throughout the output, followed by the ID information. Supporting information for these commands can be found here.

The reasoning behind the difference between PostgreSQL and SQL Server comes from PostgreSQL's MVCC implementation. Since we may have multiple copies of a row in PostgreSQL, it's more optimal to keep several copies of the data on disk instead of modifying the supporting index structures. Whenever possible, PostgreSQL does heap-only updates and only issues updates on the underlying table. SQL Server does the same thing and will only update the clustered index (or heap) when it can avoid updating the supporting indexes.

  • Wow, this answer is epic! Thanks @Jeremiah. – IamIC Mar 14 '11 at 17:39
  • I lost you a little at the end, when you say SQL Server does the same thing... they seem fundamentally different. – IamIC Mar 14 '11 at 17:40
  • 1
    Fundementally, SQL Server and PostgreSQL perform similar operations - whenever possible they're only going to update the data in the table and leave any supporting indexes alone. In both cases, this reduces physical I/O. In PostgreSQL this is also because of the MVCC system. Robert Haas wrote about this in [Index-Only Scans](http://rhaas.blogspot.com/2010/11/index-only-scans.html). He's a lot more eloquent than I am about this subject. –  Mar 14 '11 at 17:50
  • Probably worth mentioning that support for index-only scans has recently been added: http://rhaas.blogspot.com/2011/10/index-only-scans-weve-got-em.html – beldaz Oct 21 '11 at 02:28
  • True. Index only scan support has been added, but won't be available until the 9.2 release sometime next year. Even then, index only scans will only be helpful if the index is a covering index; all columns queried must be in the index. Index only scans don't change anything about the way data is written to disk. –  Oct 22 '11 at 14:06
2

You are right - Postgres cannot do what you are asking. see this question for details.

You can achieve clustering of rows using the CLUSTER command, but this does not keep the data clustered once you do DML.

Community
  • 1
  • 1
  • I take it that PostgreSQL must use something similar to SQL Server's heap index. – IamIC Mar 14 '11 at 10:50
  • What is a "heap index"? Postgres reorders the rows in a heap (table) to match the ordering in an index - but the table and index remain separate structures –  Mar 14 '11 at 16:10
  • A heap index or structure: http://msdn.microsoft.com/en-us/library/ms188270.aspx; I'm curious why it would order the rows per a heap, but not per a clustered index (psql def. thereof). – IamIC Mar 14 '11 at 16:13
  • 1
    That link is about normal heap structures, ie tables, not indexes. postgres just does not have clustered tables or index-ordered-tables iin Oracle parlance. The `cluster` command is a one off reorganization of the heap - which can be helpful for reducing IO in certain situations. It shouldn't really be compared to SQL servers clustering but happens to have a similar name. –  Mar 14 '11 at 17:00
2

I know that SQL Server can store a row's data at leaf level in a clustered index. I believe that PostgreSQL doesn't do this. If so, what is its storage paradigm?

Unlike SQL Server and other engines, PostgreSQL does not store the id of the transaction that changed the record in the indexes, only in the heap.

The indexes just point into the heap (and store the ctid of the appropriate record as a row pointer and hence a part of the key).

This means that for each query, even if it could be satisfied by an index lookup, a heap lookup should still be made to ensure the visibility of the data to the current transaction.

Thus said, covering indexes are not that useful in PostgreSQL: since heap lookups should be made anyway, the engine can just take all the data from the heap.

Quassnoi
  • 413,100
  • 91
  • 616
  • 614
  • @Quassnoi logically, this means that SQL Server should be able to outperform PostgreSQL significantly on complex lookups. Do you have any data on this? – IamIC Mar 14 '11 at 18:41
  • 1
    @IanC: heap lookup overhead is one of about `4279` factors affecting the engine performance. Unless you define "complex lookup", it is impossible to tell whether heap lookup overhead is the most important of them or not. – Quassnoi Mar 14 '11 at 18:45
  • @Quassnoi granted :) Let me rather change my question to something even less specific. "As a general rule of thumb, across the boards, from your personal observation, which DB is quicker at queries?" – IamIC Mar 14 '11 at 18:59
  • 3
    @IanC: as a general rule of thumb, across the boards, from my personal observation, both are. This is really not answerable unless you provide the table definitions, data distribution and the query. – Quassnoi Mar 14 '11 at 19:02
  • @Quassnoi a true mathematician :) – IamIC Mar 14 '11 at 19:12
  • @Quassnoi for a small fee, I won't tell ;) – IamIC Mar 14 '11 at 19:20
  • 1
    Like @Quassnoi points out, it depends a lot on schema, data, and queries. PostgreSQL will excel in situations where complex plans abound. The PostgreSQL optimizer will perform exhaustive query optimization out to a certain (configurable) number of joins. Once that limit is reached, it uses a model similar to the one SQL Server uses by default. SQL Server will perform better in situations where index only reads can be used. PostgreSQL might have an advantage under heavy OLTP loads due to the MVCC model (this could cause a lot of locking and blocking in SQL Server). –  Mar 14 '11 at 22:28
  • @Jeremiah: exactly as you say, or, probably, contrariwise. I would not dare comparing query performance without al least seeing the queries. – Quassnoi Mar 14 '11 at 22:38
2

Have a look at my SQL Indexing tutorial if you want to learn more about indexing in general.

Markus Winand
  • 8,371
  • 1
  • 35
  • 44