5

When the text and blob data type column is used in Innodb Engine. The value which are stored in that field is stored in off-page not stored in page (The default Size of Innodb page is 16kb).

My Questions are 1. What is the off-page mean? 2. How its is accessed while retrieving the value from off-page?

vinieth
  • 1,204
  • 3
  • 16
  • 34

3 Answers3

6

(To continue with what rlanvin and fangxing started, and to say more about about question 2.)

The ROW_FORMAT controls some of the details about which column(s) are stored 'off-page', and also whether the entire column is off-page or whether only the part after the first 767 bytes.

If any or all is stored off-page, a 20-byte "pointer" is left behind. This contains a various fields the have the effect of identifying the 16KB page to go to to find the column value. All pages are 16KB (assuming the default; almost no one ventures into having a different page size). Hence, a big column could take multiple blocks.

The blocks come from 'extents' which are 1MB (or is it 8MB??) allocations taken from the 'tablespace' that contains the table. Originally, ibdata* ware the only tablespaces. Later, with innodb_file_per_table, a table could be in its own tablespace. In the near future (8.0), you will be able to define tablespaces and put whichever tables you desire into it.

When doing SHOW TABLE STATUS (and having file_per_table), notice that the Data_free is 4MB, 5MB, 6MB, or 7MB, reflecting the consumption of an extent.

Where are you headed with the question? One direction could be into optimizations.

  • If you don't SELECT that column, then the overflow pages don't need to be accessed. (Note: This is a strong argument against blindly using SELECT *.)
  • If some big columns could fit, but not all, some will be stored on-page; the rest will be off-page.
  • There are multiple ROW_FORMATs; each is slightly better for certain types of data.
  • If 767 bytes are held in the main page, it seems like LEFT(col, 10) would not have to get the extra pages. I suspect this optimization is missing.
  • There are more possible optimizations, many of which are not implemented.

Why is "off page" beneficial?

  • If you are searching for some rows, the search can (hopefully) take place only in the on-page blocks.
  • When you find the one (or few) row you want, only then do you incur the extra disk hits to reach into the off-page blocks.

A partial, crude, summary of ROW_FORMATs:

  • If the value (TEXT/BLOB/VARCHAR/VARBLOB) is shorter 40 bytes, it is stored on-page.
  • If the value is bigger than 40, but not really big, it is either put on-page or off-page, depending on whether there is enough room for it. That is, larger columns will be moved off-page until the rest will fit.
  • If the value is really big, then COMPACT will put 768 bytes on-page, and add a 20-byte 'pointer' to the rest (which will be off-page)
  • Really big and DYNAMIC or COMPRESSED, then there is only the 20-byte pointer.

KEY_BLOCK_SIZE has some impact on on/off-page.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • 767 bytes in Redundant Row Format. But now default row format is Compact which uses only 40 bytes + 20 bytes for pointer in the column remaining data is stored in off page. Am i saying it right. Because i need to make that i'm clear with this topic. – vinieth Jan 17 '18 at 13:52
  • COMPACT is still 767 bytes. DYNAMIC would be 20. – akuzminsky Jan 17 '18 at 16:03
  • 1
    I added more on `ROW_FORMATs`. – Rick James Jan 17 '18 at 18:18
3
  1. What is the off-page mean

I means that it is stored in a separate area, so it doesn't bloat the main index. This way, the size of the pages can stay identical and exact (the exact storage space required for everything other types is known), and it makes accessing the data faster.

  1. How its is accessed while retrieving the value from off-page?

A 20-byte pointer is stored and it contains the address of the rest of the column.

PS. I think this question would be better suited for DBA

rlanvin
  • 6,057
  • 2
  • 18
  • 24
  • Separate area mean disk page is it right. And I didn't get the second question answer clearly. Can u make it brief. – vinieth Jan 17 '18 at 11:11
  • 1
    Instead of storing the *content* of the column, InnoDB stores a *pointer* to where the content is actually stored. To access it, it reads the address first (the pointer in the main page) and then read the content at this address (in the off page). – rlanvin Jan 17 '18 at 11:22
  • The content is stored in separate area you said. Is it stored in our disk page or ?? – vinieth Jan 17 '18 at 12:00
  • @vinieth - fanxing's Answer picks up where rlanvin leaves off. – Rick James Jan 17 '18 at 12:38
  • @RickJames can u make brief about it. – vinieth Jan 17 '18 at 12:42
  • 1
    @vinieth It is stored on disk in the same [tablespace](https://dev.mysql.com/doc/refman/5.6/en/innodb-tablespace.html) (eg. ibdata1), but not within a *page* (more about [InnoDB blocks, aka "page"](https://dev.mysql.com/doc/internals/en/innodb-page-structure.html)). Because as I explained, InnoDB pages have a fixed size, whereas these columns (BLOB, VARCHAR, etc.) have a variable size. – rlanvin Jan 17 '18 at 12:43
0

From https://dev.mysql.com/doc/refman/5.5/en/innodb-compression-internals.html

In an InnoDB table, BLOB, VARCHAR, and TEXT columns that are not part of the primary key may be stored on separately allocated overflow pages. We refer to these columns as off-page columns. Their values are stored on singly-linked lists of overflow pages.

Fangxing
  • 5,716
  • 2
  • 49
  • 53