(To continue with what rlanvin and fangxing started, and to say more about about question 2.)
The ROW_FORMAT
controls some of the details about which column(s) are stored 'off-page', and also whether the entire column is off-page or whether only the part after the first 767 bytes.
If any or all is stored off-page, a 20-byte "pointer" is left behind. This contains a various fields the have the effect of identifying the 16KB page to go to to find the column value. All pages are 16KB (assuming the default; almost no one ventures into having a different page size). Hence, a big column could take multiple blocks.
The blocks come from 'extents' which are 1MB (or is it 8MB??) allocations taken from the 'tablespace' that contains the table. Originally, ibdata*
ware the only tablespaces. Later, with innodb_file_per_table
, a table could be in its own tablespace. In the near future (8.0), you will be able to define tablespaces and put whichever tables you desire into it.
When doing SHOW TABLE STATUS
(and having file_per_table), notice that the Data_free
is 4MB, 5MB, 6MB, or 7MB, reflecting the consumption of an extent.
Where are you headed with the question? One direction could be into optimizations.
- If you don't
SELECT
that column, then the overflow pages don't need to be accessed. (Note: This is a strong argument against blindly using SELECT *
.)
- If some big columns could fit, but not all, some will be stored on-page; the rest will be off-page.
- There are multiple
ROW_FORMATs
; each is slightly better for certain types of data.
- If 767 bytes are held in the main page, it seems like
LEFT(col, 10)
would not have to get the extra pages. I suspect this optimization is missing.
- There are more possible optimizations, many of which are not implemented.
Why is "off page" beneficial?
- If you are searching for some rows, the search can (hopefully) take place only in the on-page blocks.
- When you find the one (or few) row you want, only then do you incur the extra disk hits to reach into the off-page blocks.
A partial, crude, summary of ROW_FORMATs
:
- If the value (
TEXT
/BLOB
/VARCHAR
/VARBLOB
) is shorter 40 bytes, it is stored on-page.
- If the value is bigger than 40, but not really big, it is either put on-page or off-page, depending on whether there is enough room for it. That is, larger columns will be moved off-page until the rest will fit.
- If the value is really big, then
COMPACT
will put 768 bytes on-page, and add a 20-byte 'pointer' to the rest (which will be off-page)
- Really big and
DYNAMIC
or COMPRESSED
, then there is only the 20-byte pointer.
KEY_BLOCK_SIZE
has some impact on on/off-page.