The SQL table person_rate stores a float rate value that changes over time. Columns:
id (serial, PK)
person_id (int)
date_from (date)
rate (float)
(person_id, date_from) is unique, because at most one change per day is allowed (maybe it could be the PK, but that's not important).
The rate for a given person_id is valid from its date_from up to (but not including) the date_from of the next record for that person, or indefinitely into the future if there is no such record. Example:
id   person_id  date_from   rate
101  1          2011-01-01  100.0
145  1          2012-01-01  180.0
193  1          2012-05-01  140.0
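To make the example reproducible, here is a minimal sketch of the table and the sample rows. The NOT NULL constraints and the constraint name are my assumptions for illustration; only the columns and the unique pair come from the description above.

```sql
-- Sketch of the person_rate table described above.
-- NOT NULL and the constraint name are assumptions, not part of the real schema.
CREATE TABLE person_rate (
    id        serial PRIMARY KEY,
    person_id int   NOT NULL,
    date_from date  NOT NULL,
    rate      float NOT NULL,
    -- at most one rate change per person per day
    CONSTRAINT person_rate_person_date_uq UNIQUE (person_id, date_from)
);

-- Sample rows from the example
INSERT INTO person_rate (id, person_id, date_from, rate) VALUES
    (101, 1, '2011-01-01', 100.0),
    (145, 1, '2012-01-01', 180.0),
    (193, 1, '2012-05-01', 140.0);
```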
Now I need a SELECT query which, for each person_id, returns the rate valid on some given $date. For example, for 2012-03-01 it's 180, for 2012-05-02 it's 140, and so on.
Solutions I have tested:
1) condition date_from <= $date
+ window function rank() OVER (PARTITION BY person_id ORDER BY date_from DESC)
+ in the outer select, WHERE rank = 1
2) similar to 1), but using SELECT DISTINCT ON (person_id) instead of limiting rank to 1
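Roughly, the two variants look like the sketch below, run against the person_rate table described above. The literal date stands in for $date, and the aliases ranked and rnk are names I picked for illustration.

```sql
-- 1) rank() in a subquery, then keep only the newest row per person
SELECT person_id, rate
FROM (
    SELECT person_id,
           rate,
           rank() OVER (PARTITION BY person_id
                        ORDER BY date_from DESC) AS rnk
    FROM person_rate
    WHERE date_from <= DATE '2012-03-01'   -- stands in for $date
) AS ranked
WHERE rnk = 1;

-- 2) DISTINCT ON keeps the first row per person_id in ORDER BY order,
--    i.e. the row with the latest date_from not after the given date
SELECT DISTINCT ON (person_id)
       person_id, rate
FROM person_rate
WHERE date_from <= DATE '2012-03-01'       -- stands in for $date
ORDER BY person_id, date_from DESC;
```

With the three sample rows, both should return rate 180 for person_id 1, matching the example.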
Neither 1) nor 2) performs well; EXPLAIN shows that the db needs to sort all records for each person_id and only then limit to the first one. Probably this type of query can't fully utilise an index on date_from?
IDEA - add a date_to column, which would be a bit redundant, because its value would be "date_from of the successive record, minus 1 day" (or +infinity if there is no successive record). But the query could then use date_from <= $date AND date_to >= $date - which would probably perform well with indexes on date_from and date_to.
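To show what I mean, here is a sketch that derives the proposed date_to on the fly with lead() and then applies the range condition. The CTE name person_rate_span is hypothetical, and computing date_to in a CTE brings no performance gain by itself; it only illustrates the values a stored date_to column would hold and the shape of the query I would like to run.

```sql
-- date_to = next date_from for the same person minus 1 day,
-- or 'infinity' when there is no later record.
WITH person_rate_span AS (
    SELECT id,
           person_id,
           date_from,
           COALESCE(
               (lead(date_from) OVER (PARTITION BY person_id
                                      ORDER BY date_from)) - 1,
               'infinity'::date
           ) AS date_to,
           rate
    FROM person_rate
)
-- The intended query shape once date_to exists as a real column:
SELECT person_id, rate
FROM person_rate_span
WHERE date_from <= DATE '2012-03-01'   -- stands in for $date
  AND date_to   >= DATE '2012-03-01';
```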
But I'm a bit afraid of how to manage data integrity in this case - how do I set up a constraint so that the [ date_from .. date_to ] intervals for one person_id do not overlap?
What is the best solution in PostgreSQL for this type of query? The load is mostly reads, with few writes to the person_rate table. A typical query would internally need to get a rate for each day in a month...
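As an illustration of that workload, a per-day lookup for one month could look like the sketch below. It reuses the DISTINCT ON approach, so it is only meant to show the kind of query involved, not a tuned solution; the month and the alias d(day) are arbitrary choices.

```sql
-- One row per (day, person): the rate valid on that day.
-- Persons with no record on or before a given day are omitted for that day.
SELECT DISTINCT ON (d.day, r.person_id)
       d.day::date AS day,
       r.person_id,
       r.rate
FROM generate_series(TIMESTAMP '2012-05-01',
                     TIMESTAMP '2012-05-31',
                     INTERVAL '1 day') AS d(day)
JOIN person_rate AS r
  ON r.date_from <= d.day
ORDER BY d.day, r.person_id, r.date_from DESC;
```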
Maybe this question, "SQL query for index/primary key ordinal", together with the new indexes in pg 9.2, can help somehow?