
I am trying to insert 300,000,000 records into an AWS Aurora PostgreSQL database table. The table has about 170 columns.

I am using a tool which uses multiple threads to insert in parallel.

After about 92,000,000 records, the insertions suddenly hang (before hanging, the tool inserts about 1,500 records per second), and a single insertion (even one executed through psql) takes about 30 minutes to complete.

  • If I truncate the table and retry, or use a new Aurora instance, the same thing happens at almost exactly the same record count.
  • Restarting the DB doesn't have an effect.
  • Increasing the DB instance size doesn't have an effect.
  • Removing all the indexes and constraints (except for the primary key) didn't have an effect.
  • Insertions into other tables work fine while the insertions into the original table hang.
  • If I use a non-Aurora PostgreSQL RDS instance, the same issue occurs at around 85,000,000 records.

The following log entry is logged repeatedly from the time the issue first occurred:

2023-08-21 13:36:49 UTC:10.142.211.20(36632):mit@postgres:[13997]:LOG:  still searching for an unused OID in relation "pg_toast_16720"
2023-08-21 13:36:49 UTC:10.142.211.20(36632):mit@postgres:[13997]:DETAIL:  OID candidates have been checked 1000000 times, but no unused OID has been found yet.
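
For what it's worth, the suffix of the toast table name should be the OID of the owning table (if I understand the naming convention correctly), so the table it belongs to can be resolved like this:

```sql
-- Resolve which table pg_toast_16720 belongs to: the numeric suffix
-- is the owning table's OID, and ::regclass renders it as a name.
SELECT 16720::regclass;
```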

Following is a screenshot of the RDS Performance Insights dashboard.

What could be causing this sudden slowness of inserts?

  • Are you inserting the records in batches? e.g. https://stackoverflow.com/a/67092893/2275388 if applicable; with that number of records, `copy` would be a better choice – Jim Jones Aug 22 '23 at 06:47
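
For reference, a minimal sketch of the bulk-load approach the comment suggests, assuming the data can be exported to CSV first (file path and table name are hypothetical):

```sql
-- psql meta-command: stream the file from the client in a single COPY,
-- avoiding per-row INSERT round-trips (path and table are placeholders)
\copy my_big_table FROM '/tmp/records.csv' WITH (FORMAT csv, HEADER true)
```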

1 Answer


I think a table can only have 2^32 (or maybe 2^31) TOASTed values in it, since each out-of-line value needs its own OID in the toast table. It seems like you are approaching that limit: 2^32 / 92,000,000 ≈ 46.7, so apparently about 46 of your columns are TOASTed in each row, on average. Does that seem plausible?
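
One way to sanity-check that (the table name below is a placeholder) is to list each column's storage strategy from pg_attribute; columns marked `x` (extended) or `e` (external) are candidates for out-of-line TOAST storage:

```sql
-- Storage strategy per column: p = plain, m = main (inline, compressed),
-- x = extended (compress, then out-of-line), e = external (out-of-line, uncompressed)
SELECT attname, attstorage
FROM pg_attribute
WHERE attrelid = 'my_big_table'::regclass
  AND attnum > 0
  AND NOT attisdropped;
```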

Maybe you need to partition your data, or think about redesigning your schema in some other way.

  • Thanks for the answer. My table doesn't have any large columns. The largest column in the table is varchar(160). Is it still possible that TOAST is used for the table? – Lahiru Chandima Aug 23 '23 at 06:31
  • Ok, I got it working by changing the storage type of all varchar columns to `plain` (it was `extended` before, which is the default storage type for varchar; see the sketch after these comments). So it looks like `extended` storage column values use TOAST, right? – Lahiru Chandima Aug 23 '23 at 08:35
  • Yes. `plain` can also use TOAST, but only as a last resort. Apparently that is good enough for you here, but it still seems pretty fragile to operate so close to the edge – jjanes Aug 23 '23 at 10:05
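
For completeness, a sketch of the storage change described in the comments above, with hypothetical table and column names (repeat per varchar column):

```sql
-- Keep values inline instead of moving them to the toast table.
-- Note: SET STORAGE only affects rows written afterwards; existing
-- rows keep their current representation until they are rewritten.
ALTER TABLE my_big_table ALTER COLUMN description SET STORAGE PLAIN;
```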