7

I have a simple table which has a user_birthday field with a type of date (which can be NULL value)

CREATE TABLE users
(
  user_id bigserial NOT NULL,
  user_email text NOT NULL,
  user_password text,
  user_first_name text NOT NULL,
  user_middle_name text,
  user_last_name text NOT NULL,
  user_birthday date,
  CONSTRAINT pk_users PRIMARY KEY (user_id)
)

There's an index (btree) defined on that field, with the rule of NOT user_birthday IS NULL.

CREATE INDEX ix_users_birthday
  ON users
  USING btree
  (user_birthday)
  WHERE NOT user_birthday IS NULL;

Trying to follow up on another idea, I've added the extension btree_gist and created the following index:

CREATE INDEX ix_users_birthday_gist
  ON glances.users
  USING gist
  (user_birthday)
  WHERE NOT user_birthday IS NULL;

But it had no affect either, as from what I could read it is not used for range checking.

The PostgreSQL version is 9.3.4.0 (22) Postgres.app and issue also exists in 9.3.3.0 (21) Postgres.app

I've been intrigued by the following queries:

Query #1:

EXPLAIN ANALYZE SELECT *
FROM users
WHERE user_birthday <@ daterange('[1978-07-15,1983-03-01)')

Query #2:

EXPLAIN ANALYZE SELECT *
FROM users
WHERE user_birthday BETWEEN '1978-07-15'::date AND '1983-03-01'::date

which, at first glance both should have the same execution plan, but for some reason, here are the results:

Query #1:

"Seq Scan on users  (cost=0.00..52314.25 rows=11101 width=241) (actual
time=0.014..478.983 rows=208886 loops=1)"
"  Filter: (user_birthday <@ '[1978-07-15,1983-03-01)'::daterange)"
"  Rows Removed by Filter: 901214"
"Total runtime: 489.584 ms"

Query #2:

"Bitmap Heap Scan on users  (cost=4468.01..46060.53 rows=210301 width=241)
(actual time=57.104..489.785 rows=209019 loops=1)"
"  Recheck Cond: ((user_birthday >= '1978-07-15'::date) AND (user_birthday
<= '1983-03-01'::date))"
"  Rows Removed by Index Recheck: 611375"
"  ->  Bitmap Index Scan on ix_users_birthday  (cost=0.00..4415.44
rows=210301 width=0) (actual time=54.621..54.621 rows=209019 loops=1)"
"        Index Cond: ((user_birthday >= '1978-07-15'::date) AND
(user_birthday <= '1983-03-01'::date))"
"Total runtime: 500.983 ms"

As you can see, the <@ daterange is not utilizing the existing index, while BETWEEN does.

Important to note that the actual use case for this rule is in a more complex query, which doesn't result in the Recheck Cond and Bitmap Heap scan. In the application complex query, the difference between the two methods (with 1.2 million records) is massive: Query #1 at 415ms Query #2 at 84ms.

Is this a bug with daterange? Am I doing something wrong? or datarange <@ is performing as designed?

There's also a discussion in the pgsql-bugs mailing list

Shahar Hadas
  • 2,641
  • 1
  • 27
  • 29

1 Answers1

5

BETWEEN includes upper and lower border. Your condition

WHERE user_birthday BETWEEN '1978-07-15'::date AND '1983-03-01'::date

matches

WHERE user_birthday <@ daterange('[1978-07-15,1983-03-01]')

I see you mention a btree index. For that use simple comparison operators.

Detailed manual page on which index is good for which operators.

The range type operators <@ or @> would work with GiST indexes.
Example:
Perform this hours of operation query in PostgreSQL

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • 1
    Tried that already. postgres automatically converts [] daterange to [) daterange. No affect - still Seq Scan. – Shahar Hadas Apr 02 '14 at 23:08
  • GIST is used when you have an index on a daterange field. here I'm just using a daterange to filter. – Shahar Hadas Apr 02 '14 at 23:13
  • @Sash: Combine `<@` with a gist index on a range type - or a functional index. For the btree index use simple comparison operators. – Erwin Brandstetter Apr 02 '14 at 23:14
  • I don't have a range type. Added the test table creation. I'm trying to create the complex gist index as you suggested, but it seems that in order to do so, I need to create an index which takes the user birthday as the two values of the range. – Shahar Hadas Apr 02 '14 at 23:36
  • Well, I'd rather suggest to use simple comparison operators with the btree index. :) A GiST index might be a good idea for a nearest neighbor search, but your example works better with simple comparison operators and a btree index. – Erwin Brandstetter Apr 02 '14 at 23:42
  • I've just reported this as a possible bug to postgres. I'll update this question once I hear back. As for the index, after some more googling, it seems the solution is btree_gist extension, but I'm still looking into this. – Shahar Hadas Apr 02 '14 at 23:52
  • @Sash: It's not a bug, the operators are just not defined. Why wouldn't you want to use the cheaper simple operators? There is nothing to gain for you here ... – Erwin Brandstetter Apr 02 '14 at 23:54
  • Because I already have the range to filter by in another table as daterange. so now I need to use upper and lower to extract those values with COALESCE with infinity and -infinity. so its ugly ... and I don't yet if it adds slowness or not. I'm working on those changes right now. – Shahar Hadas Apr 03 '14 at 00:04
  • regarding the operators - am I missing something? according to this they do exist: http://www.postgresql.org/docs/9.3/static/functions-range.html - look at the timestamp example. – Shahar Hadas Apr 03 '14 at 00:08
  • The operators exist - else your query would have failed altogether. What's missing are the *index operator classes*. btree indexes cannot handle `<@` `@>` in standard Postgres. Consider the link to the manual I added last. It explains more than my first link. – Erwin Brandstetter Apr 03 '14 at 00:21
  • moving forward on your suggestion, I've added btree_gist, and then added the following gist index, but still it doesn't uses any index – Shahar Hadas Apr 03 '14 at 00:45
  • @Sash: [`btree_gist`](http://www.postgresql.org/docs/current/interactive/btree-gist.html) allows you to include the listed basic types in a GiST index, not the other way round (implement additional operators for btree indexes). That might be a lot more complicated. I don't know of a module providing what you are after. If you should find something, be sure to post an answer here. – Erwin Brandstetter Apr 03 '14 at 00:54