6

I have the following table and index created:

CREATE TABLE cdc_auth_user
(
  cdc_auth_user_id bigint NOT NULL DEFAULT nextval('cdc_auth_user_id_seq'::regclass),
  cdc_timestamp timestamp without time zone DEFAULT ('now'::text)::timestamp without time zone,
  cdc_operation text,
  id integer,
  username character varying(30)
);

CREATE INDEX idx_cdc_auth_user_cdc_timestamp
          ON cdc_auth_user
       USING btree (cdc_timestamp);

However, when I perform a select using the timestamp field, the index is being ignored and my query takes almost 10 seconds to return:

EXPLAIN SELECT *
          FROM cdc_auth_user
         WHERE cdc_timestamp BETWEEN '1900/02/24 12:12:34.818'
                             AND '2012/02/24 12:17:45.963';


Seq Scan on cdc_auth_user  (cost=0.00..1089.05 rows=30003 width=126)
  Filter: ((cdc_timestamp >= '1900-02-24 12:12:34.818'::timestamp without time zone) AND (cdc_timestamp <= '2012-02-24 12:17:45.963'::timestamp without time zone))
exodar
  • 85
  • 2
  • 6
  • 1
    How many do you have in your table ? It might be because the cost would be greater if it uses the index instead of a table scan. – Luc M Feb 24 '12 at 18:20

1 Answers1

2

If there are a lot of results, the btree can be slower than just doing a table scan. btree indices are really not designed for this kind of "range-selection" kind of query you're doing here; the entries are placed in a big unsorted file and the index is built against that unsorted group, so every result potentially requires a disk seek after it is found in the btree. Sure, the btree can be easily read in order but the results still need to get pulled from the disk.

Clustered indices solve this problem by ordering the actual database records according to what's in the btree, so they actually are helpful for ranged queries like this. Consider using a clustered index instead and see how it works.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Billy ONeal
  • 104,103
  • 58
  • 317
  • 552
  • 2
    The CLUSTER command in PostgreSQL is only loosely related to "clustered indexes". What the command does is reorder the rows of a table according to the sort of the index. Does not change anything about the index and it is a one-time operation. The table is not tied to the index in any way. Later writes will be placed as always. CLUSTER can still be very useful. – Erwin Brandstetter Feb 24 '12 at 19:09
  • @Edwin: Hmm.. that's annoying. Are there any kind of clustered btree indices at all in PostgreSQL? – Billy ONeal Feb 24 '12 at 20:56
  • 1
    PostgreSQL does not have the same "clustered indexes" as Oracle does. But if you need that functionality and schedule CLUSTER on a regular basis in a cronjob or triggered by some event, then you should have most of the benefits in most cases. The exception would be heavily written tables where you have no good method to decide when to re-cluster). There is also the `clusterdb`(http://www.postgresql.org/docs/current/interactive/app-clusterdb.html) shell utility. And to be precise: the tables saves the information which index to use for the next CLUSTER without parameter. – Erwin Brandstetter Feb 24 '12 at 21:02
  • @ErwinBrandstetter: I'm not thinking oracle; I'm thinking mssql or mysql. :sigh: I'm surprised it doesn't support this sort of thing. – Billy ONeal Feb 24 '12 at 21:57