4

The question is really about optimization of sql queries. Let us say we have the table so defined.

CREATE TYPE record_type AS ENUM (
  'TRANSFER',
  'TRADE',
  'VOUCHER'
);

CREATE TYPE record_status AS ENUM (
  'NEW',
  'VALIDATED',
  'EXPIRED'
);

CREATE TABLE good_records (
  id uuid PRIMARY KEY,
  user_id uuid NOT NULL,
  type record_type NOT NULL,
  status record_status NOT NULL,
  amount numeric(36,18) NOT NULL DEFAULT 0,
  expired_at timestamp WITH TIME ZONE NOT NULL,
  notification_sent boolean DEFAULT false,
);

I want to run an expiration check every 10 min, namely, I would run SELECT * FROM good_records where record_status = 'NEW' and notification_sent = false (and SELECT * FROM good_records where record_status = 'VALIDATED' and notification_sent = false). But as I monitor the db resources usage, it comes with no surprise that that two queries costs a lot.

My question is whether it is possible to put indexing on the table somehow so that I can fasten the queries and save db resources.

I have briefly read the postgresql docs but with no good solution.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
hinamizawa
  • 131
  • 1
  • 8
  • 1
    For performance optimization, always start with your version of Postgres. Also essential here: cardinalities for those enum values and rough characteristic of database activity. How many writes per time unit? And: Do you actually need `SELECT *`, or is a limited list of columns good enough? Any indexes there, yet? – Erwin Brandstetter Jun 05 '20 at 03:39

1 Answers1

11

It's certainly possible to index enum columns. But since there are typically only few distinct values, partial indexes are typically more efficient. Details depend on missing information.

For example, assuming there are only few rows with notification_sent = false, and all you need to retrieve is the id, this index would serve both queries:

CREATE INDEX foo ON good_records (record_status, id)
WHERE notification_sent = false;

If there is a lot of write activity, be sure to have aggressive autovacuum settings for the table to keep table and index bloat at bay and allow index-only scans.

Adding id to the index only makes sense if it can give you index-only scans.

If you never filter on id, use the INCLUDE clause instead (Postgres 11 or later). Slightly more efficient:

CREATE INDEX foo ON good_records (record_status) INCLUDE (id)
WHERE notification_sent = false;

Related:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228