4

in the wiki for PostgreSQL related to TOASTed tables (https://wiki.postgresql.org/wiki/TOAST) says that:

"You cannot have more than 2^32 (4 billion) out-of-line values in a single table, because there would have to be duplicated OIDs in its TOAST table."

What does it mean?

1) the TOAST table cannot have more than 4 billion rows? or

2) the TOAST table cannot have more than 4 billion distinct values of OIDs (values for column chunk_id)?

We have a toast table with 3.2 billion rows and wondering if we are close to the limits.

Thanks in advance

Fercho
  • 43
  • 4

1 Answers1

5

TOAST tables are defined like this:

\d pg_toast.pg_toast_59238
TOAST table "pg_toast.pg_toast_59238"
   Column   |  Type   
------------+---------
 chunk_id   | oid
 chunk_seq  | integer
 chunk_data | bytea

Here chunk_id is the identifier of a single toasted value, and chunk_seq is the index for the parts into which the toasted value has been split.

Since there are only around 4 billion different unsigned 4-byte integers, and that is what the data type oid is, there can be only 4 billion toasted data in each database table.

However, not each entry in a table gets toasted: only when the size of a table row exceeds 2000 bytes after compression, values are stored out of line.

You can fund the TOAST table for your table:

SELECT reltoastrelid
FROM pg_class
WHERE relname = 'mytable';

Then you can find how many toasted entries there are:

SELECT count(DISTINCT chunk_id)
FROM pg_toast.pg_toast_12345;

Warning: That is an expensive query.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • does "4 billion toasted data in each database table" mean "4 billion at any moment", or "4 billion including previously created and deleted"? – Andrey B. Panfilov May 02 '23 at 09:39
  • @AndreyB.Panfilov Four billion including all deleted or updated rows that have not yet been cleaned up by `VACUUM`. Normally, autovacuum will take care of this garbage soon. – Laurenz Albe May 02 '23 at 10:31
  • sorry, that is bit unclear, does `vacuum` somehow compact `oid`'s or there is some technique which is filling gaps or storing information about unused `oid`'s – Andrey B. Panfilov May 02 '23 at 10:47
  • @AndreyB.Panfilov PostgreSQL will pick an `oid` that is not already in use. So it is filling gaps. If your tables are big enough that you feel like asking this question, you should partition them. That gets rid of the limit. – Laurenz Albe May 02 '23 at 12:09