2

I've been googling back and forth for this, and could not get a grasp of how are the table data blocks structured on the disk.

Many resources state that doing a full table scan reads the blocks sequentially (which means that the DB is able to read multiple block at a time), but i couldn't find any resource actually describing how are blocks kept on disk in the case of a heap VS the case of a clustered index.

Heaps do not dictate order, which reasons with the fact that the DB does not care about the order of blocks that it reads from the disk but:

  1. I still didn't find any evidence which guarantees that heap data is stored sequentially on disk
  2. With a clustered index, the order of the results does matter. In that case, i can't understand how can the DB keep blocks sequentially while still keeping the order. Does sequential reads still hold with a clustered index?

Any resource which describes how blocks are laid out on disk in each case, would help

Yoni Dor
  • 323
  • 3
  • 8
  • 1
    Are you asking about [tag:mysql] or [tag:oracle]? These are two different implementations of an RDBMS. The best answer depends on which one you're asking about. – Bill Karwin Jan 17 '21 at 19:42
  • I actually thought that these 2 would use the same principles, but let's say MySQL for this matter :) @BillKarwin – Yoni Dor Jan 17 '21 at 21:16

1 Answers1

3

You asked about MySQL, and that generally means the InnoDB storage engine, which is the default.

InnoDB does not store tables as a heap.

InnoDB tables are always stored as a clustered index, where the clustered index is the primary key. A table-scan is therefore more or less equivalent to an index-scan of the clustered index.

Any index in InnoDB is not usually stored sequentially on disk. It's stored as a collection of pages, where page have a uniform size of 16KB. The index is obviously much larger than this, and over time insertions and updates expand parts of the index in the middle as well as at the end. To do this efficiently (that is, without needing to rewrite the whole table), random insertions and updates result in the pages being out of order. New pages created are placed wherever there is room in the file.

To facilitate scanning through all the pages, each page contains links to the location of the next page and the preceding page. These may be quite far away in the file, so a table-scan will not actually be sequential, it will involve many seeks to other locations in the file.

InnoDB requires that pages are loaded into RAM before it can actually use them in queries. The InnoDB buffer pool is a fixed-size allocation of RAM, which contains a set of pages loaded from disk. Once the pages are in the buffer pool, they can be accessed very quickly, and with virtually no overhead for following links. The overhead of reading a page from disk into the buffer pool is orders of magnitude much greater than reading a page once it is in RAM.

So in the case of MySQL:

  • There is no heap
  • Sequential order by clustered index has nothing to do with sequential storage on disk
  • Reads are made to pages in RAM anyway, so the physical layout on disk has little to do with the order pages will be read
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • Thanks Bill. that's very helpful. Do other engines (not MySql) store the data in a sequential way? how does that work when using a clustered index? – Yoni Dor Jan 17 '21 at 23:20
  • 1
    I'm not deeply familiar with other engines, but I would think that they do not. What would be the advantage of storing data sequentially? At least in an OLTP database, table-scans should be avoided. Optimizing for them is missing the point. Sequential storage would also make it _much_ more expensive to do updates/inserts in the middle of the table, because a single-row update would need to rewrite the whole table from that point to the end. – Bill Karwin Jan 18 '21 at 00:35
  • 1
    Also, a clustered index is an optimization for lookups, not table-scans. I wouldn't expect clustered indexing to be used together with sequential storage. – Bill Karwin Jan 18 '21 at 00:36