2

I have 2 tables:

account_transaction:

+-------------------------------+--------------------------+------------------------+
|            Column             |           Type           |       Modifiers        |
+-------------------------------+--------------------------+------------------------+
| id                            | integer                  | not null               |
| account_id                    | bigint                   | not null               |
| created                       | timestamp with time zone | not null default now() |
| transaction_type              | text                     | not null               |
| amount                        | numeric(5,2)             | not null               |
| external_reference_id         | character varying(60)    |                        |
+-------------------------------+--------------------------+------------------------+

Indexes:

"idx_account_transaction_created" btree (created)

reporting_period:

+------------+--------------------------+-----------+
|   Column   |           Type           | Modifiers |
+------------+--------------------------+-----------+
| month      | text                     |           |
| created    | timestamp with time zone |           |
| date_range | tstzrange                |           |
+------------+--------------------------+-----------+

I want to get all of the transaction for the last reporting period. Here are the two queries that yield the same results, but one does a seq scan and the other can use the idx_account_transaction_created index.

explain select count(*) from account_transaction where created <@ (select date_range from reporting_period order by created desc limit 1);
+----------------------------------------------------------------------------------------+
|                                       QUERY PLAN                                       |
+----------------------------------------------------------------------------------------+
| Aggregate  (cost=4214.81..4214.82 rows=1 width=0)                                      |
|   InitPlan 1 (returns $0)                                                              |
|     ->  Limit  (cost=13.20..13.20 rows=1 width=40)                                     |
|           ->  Sort  (cost=13.20..13.60 rows=800 width=40)                              |
|                 Sort Key: reporting_period.created                                     |
|                 ->  Seq Scan on reporting_period  (cost=0.00..12.40 rows=800 width=40) |
|   ->  Seq Scan on account_transaction  (cost=0.00..4200.81 rows=1602 width=0)          |
|         Filter: (created <@ $0)                                                        |
+----------------------------------------------------------------------------------------+
(8 rows)

explain select count(*) from account_transaction where created >= '2014-06-01' and created <= '2014-06-30 23:59:59.999999';
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                            QUERY PLAN                                                                            |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Aggregate  (cost=2640.54..2640.54 rows=1 width=0)                                                                                                                |
|   ->  Index Only Scan using idx_account_transaction_created on account_transaction  (cost=0.08..2605.77 rows=69535 width=0)                                      |
|         Index Cond: ((created >= '2014-06-01 00:00:00+00'::timestamp with time zone) AND (created <= '2014-06-30 23:59:59.999999+00'::timestamp with time zone)) |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
(3 rows)

I like the first query better because it seems easier to read and understand and it's only 1 round trip. The second one is more efficient because it uses the index on the created field, but will mean that the application will need to go out and get the last reporting period and get the lower and upper boundaries on the date_range field (Not the worst thing in the world). And I guess I could always write it as a function or a view. But, I'm just a little surprised that PostgreSQL didn't figure out it could use an index. Is there something I'm missing here? Is there anyway to make the first query use the index?

I'm using PostgreSQL 9.3

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
David S
  • 12,967
  • 12
  • 55
  • 93

1 Answers1

1

The operator <@ requires a GIN or GiST index to work with. Doesn't work with a plain B-tree index.
Details in the manual here.
Related answers:

Alternative

For your use case a B-tree index is probably more efficient. This should allow Postgres to use it:

SELECT count(*) AS ct
FROM  (
   SELECT lower(date_range) AS ts_from, upper(date_range) AS ts_to
   FROM   reporting_period
   ORDER  BY created DESC
   LIMIT  1
   ) r
JOIN   account_transaction a ON a.created >= r.ts_from
                            AND a.created <  r.ts_to
;

Assuming that all your tstzrange values have including lower bounds and excluding upper bounds (the suggested default). To enforce that I suggest a CHECK constraint in your table reporting_period:

CHECK (lower_inc(date_range) AND NOT upper_inc(date_range))

Else you need more elaborate conditions. Related answer:

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228