-2

I have a single table with a composite primary key consisting of an int4 and a DATE column so it is id, created_at.

I am planning to query this table with a date range, so something like this:

SELECT * 
FROM table 
WHERE id = 4 
  AND created_at BETWEEN '2/2/2022' AND '3/2/2022';

Now my question is what is the best index to use for this scenario? It is a single table that won't be partitioned and all of the rows are insert only, no updates will ever occur.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Rami
  • 490
  • 7
  • 22
  • Have you executed the query and looked at the query plan? If you already have a composite primary key that covers those two fields, I would expect an index seek on that primary key, and no additional index would be required. Please include the query plan in your question – Dijkgraaf Jul 13 '23 at 20:26
  • The question doesn't have enough detail to offer a meaningful recommendation. How many rows are in the table? How selective are the `id` and `created_at` columns? General guidance is to arrange columns in the index definition in descending order of selectivity. – JohnH Jul 13 '23 at 22:09
  • When the two columns id and created_at are the primary key, there is already an index for this combination. Of course you could create additional indexes, but that would be just overhead. Use explain (analyze, verbose, buffers) to get the query plan and all details about the execution of the query – Frank Heikens Jul 14 '23 at 05:22
  • 1
    The questions is pretty straight forward for anyone that has experience with this exact scenario. I have no idea why I received 3 down votes when. What index is best for a table like this and for this exact query. If I go and trial and error this with millions of records then I'll likely lose customers and a lot of money. I don't have any idea why this community is so toxic. – Rami Jul 14 '23 at 09:07

1 Answers1

1

Make id the leading column:

PRIMARY KEY (id, created_at)

The rule of thumb is: equality first, range later. See:

Related:

Also, since ...

It is a single table that won't be partitioned and all of the rows are insert only, no updates will ever occur.

It will be beneficial to CLUSTER the table on the PK index from time to time. CLUSTER rewrites the table according to index order and takes an exclusive lock. So only an option if there are maintenance windows allowing that. Else, look to the community tools pg_repack or pg_squeeze, which can do the same non-blocking. See:

Postgres has to read a minimum of data pages from a table freshly clustered like that. The effect is small for very selective queries, where index performance dominates, but grows with the number of qualifying rows. (CLUSTER and the community tool also rewrite the index in pristine condition.)

Finally, do you really need SELECT *? SELECT id, created_at could be (mostly) served from the index directly with an index-only scan - if you vacuum the table enough.

Aside: always use the recommended ISO date-format, which is unambiguous with any locale setting. '2022-03-02' (or '2022-3-2'), not '3/2/2022'.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228