11

We have a huge table that contains bloat on the primary key index. We constantly archive old records on that table.

We reindex other columns by recreating the index concurrently and dropping the old one. This is to avoid interfering with production traffic. But this is not possible for a primary key since there are foreign keys depending on it. At least based on what we have tried.

What's the right way to reindex the primary key safely without blocking DML statements on the table?

froi
  • 7,268
  • 5
  • 40
  • 78
  • 1
    Do you want to reset the PK sequences to smaller number or do you want to refresh the underlying Unique Index of the PK? – jlandercy Jan 27 '19 at 11:52
  • 2
    What's wrong with `reindex`? –  Jan 27 '19 at 12:43
  • @a_horse_with_no_name reindex will be blocking potential upcoming transactions/production traffic. and is not that ideal on our case. Running reindex would work, but it will hold incoming transactions until it finishes. – froi Jan 28 '19 at 11:13
  • 1
    @jlandercy Not about resetting the Primary key count. More on just removing the bloat on the primary key index. – froi Jan 28 '19 at 11:18
  • Then create a new unique index (concurrently). When that is finished, drop the old PK and create a new one using the newly created index. Do the "swapping" of the index for the PK in a transaction to avoid invalid data –  Jan 28 '19 at 11:20
  • 1
    In the swapping process, how would you resolve the foreign key dependencies. Right now, it does not allow me to just drop the primary key due to that. Do I have to recreate all the foreign keys as well, then swap? – froi Jan 28 '19 at 11:33
  • This [post](https://stackoverflow.com/questions/37976832/how-to-list-tables-affected-by-cascading-delete) explain how to list FOREIGN KEY dependencies. You could write a pg/plpsql FONCTION that list CONSTRAINT to drop (or use CASCADE) and then to recreate them after having dropped the PRIMARY KEY. – jlandercy Jan 31 '19 at 16:11

3 Answers3

4

REINDEX CONCURRENTLY seems to work as well. I tried it on my database and didn't get any error.

REINDEX INDEX CONCURRENTLY <indexname>;

I think it possibly does something similar to what @jlandercy has described in his answer. While the reindex was running I saw an index with suffix _ccnew and the existing one was intact as well. Eventually I guess that index was renamed as the original index after dropping the older one and I eventually see a unique primary index on my table.

I am using postgres v12.7.

amit_saxena
  • 7,450
  • 5
  • 49
  • 64
  • `CONCURRENTLY` option is avaliable since V12, previous versions doesn't have this possibility – EAmez May 30 '23 at 11:02
2

TL;DR

Just reindex it as other index using its index name:

REINDEX INDEX <indexname>;

MCVE

Let's create a table with a Primary Key constraint which is also an Index:

CREATE TABLE test(
    Id BIGSERIAL PRIMARY KEY
);

Looking at the catalogue we see the constraint name:

SELECT conname FROM pg_constraint WHERE conname LIKE 'test%';
-- "test_pkey"

Having the name of the index, we can reindex it:

REINDEX INDEX test_pkey;

You can also fix the Constraint Name at the creation:

CREATE TABLE test(
    Id BIGSERIAL NOT NULL
);
ALTER TABLE test ADD CONSTRAINT myconstraint PRIMARY KEY(Id);

If you must address concurrence, then use the method a_horse_with_no_name suggested, create a unique index concurrently:

-- Ensure Uniqueness while recreating the Primary Key:
CREATE UNIQUE INDEX CONCURRENTLY tempindex ON test USING btree(Id);
-- Drop PK:
ALTER TABLE test DROP CONSTRAINT myconstraint;
-- Recreate PK:
ALTER TABLE test ADD CONSTRAINT myconstraint PRIMARY KEY(Id);
-- Drop redundant Index:
DROP INDEX tempindex;

To check Index existence:

SELECT * FROM pg_index WHERE indexrelid::regclass = 'tempindex'::regclass
jlandercy
  • 7,183
  • 1
  • 39
  • 57
  • You can do: `ALTER TABLE test DROP CONSTRAINT myconstraint, ADD CONSTRAINT myconstraint PRIMARY KEY(Id) using index tempindex;`. It is transactional and there is no need for manual `drop index`. – user158037 Jan 23 '20 at 15:17
  • Is it possible to reindex concurrently the primary key? – Leonardo Feb 01 '21 at 18:31
2

You can use pg_repack for this.

pg_repack is a PostgreSQL extension which lets you remove bloat from tables and indexes, and optionally restore the physical order of clustered indexes.

It doesn't hold exclusive locks during the whole process. It still does execute some locks, but this should be for a short period of time only. You can check the details here: https://reorg.github.io/pg_repack/

To perform repack on indexes, you can try:

pg_repack -t table_name --only-indexes
len
  • 426
  • 3
  • 11