
Let's say you have the following PostgreSQL sparse table listing reservation dates:

CREATE TABLE reserved_dates (
    reserved_days_id    SERIAL  NOT NULL,
    reserved_date       DATE    NOT NULL
);

INSERT INTO reserved_dates (reserved_date) VALUES
    ('2014-10-11'),
    ('2014-10-12'),
    ('2014-10-13'),
    -- gap
    ('2014-10-15'),
    ('2014-10-16'),
    -- gap
    ('2014-10-18'),
    -- gap
    ('2014-10-20'),
    ('2014-10-21');

How do you aggregate those dates into continuous date ranges (i.e. ranges without gaps), such as:

 start_date | end_date
------------+------------
 2014-10-11 | 2014-10-13
 2014-10-15 | 2014-10-16
 2014-10-18 | 2014-10-18
 2014-10-20 | 2014-10-21

This is what I came up with so far, but I can only get start_date this way:

WITH reserved_date_ranges AS (
    SELECT reserved_date,
           reserved_date
           - LAG(reserved_date) OVER (ORDER BY reserved_date) AS difference
    FROM reserved_dates
)
SELECT *
FROM reserved_date_ranges
WHERE difference > 1 OR difference IS NULL;

1 Answer

SELECT min(reserved_date) AS start_date
     , max(reserved_date) AS end_date
FROM  (
   SELECT reserved_date
        , reserved_date - row_number() OVER (ORDER BY reserved_date)::int AS grp
   FROM   reserved_dates
   ) sub 
GROUP  BY grp
ORDER  BY grp;
  1. Compute gap-less serial numbers in chronological order with the window function row_number(). Duplicate dates are not allowed. (I added a UNIQUE constraint in the fiddle.)

    If your reserved_days_id happens to be gap-less and in chronological order, you can use that directly instead. But that's typically not the case.

  2. Subtract that (after casting it to integer) from reserved_date in each row. Consecutive days end up with the same date value grp, which has no purpose or meaning other than to form groups.

  3. Aggregate in the outer query. Voilà.
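The steps above can be run end to end as a small sketch. This uses Python's bundled sqlite3 as a stand-in for PostgreSQL (an assumption for portability; it needs SQLite 3.25+ for window functions), with julianday() truncated to an integer replacing Postgres's native date-minus-date arithmetic:

```python
# Runnable sketch of the row_number() gaps-and-islands technique.
# sqlite3 stands in for PostgreSQL; CAST(julianday(d) AS INTEGER)
# yields a day number, mimicking Postgres date subtraction.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reserved_dates (reserved_date TEXT NOT NULL);
    INSERT INTO reserved_dates VALUES
        ('2014-10-11'), ('2014-10-12'), ('2014-10-13'),
        ('2014-10-15'), ('2014-10-16'),
        ('2014-10-18'),
        ('2014-10-20'), ('2014-10-21');
""")

rows = conn.execute("""
    SELECT min(reserved_date) AS start_date,
           max(reserved_date) AS end_date
    FROM (
        SELECT reserved_date,
               -- day number minus row number: constant within each island
               CAST(julianday(reserved_date) AS INTEGER)
                 - row_number() OVER (ORDER BY reserved_date) AS grp
        FROM reserved_dates
    ) sub
    GROUP BY grp
    ORDER BY grp
""").fetchall()

for start_date, end_date in rows:
    print(start_date, end_date)
# 2014-10-11 2014-10-13
# 2014-10-15 2014-10-16
# 2014-10-18 2014-10-18
# 2014-10-20 2014-10-21
```

ISO-formatted date strings sort lexicographically in chronological order, which is why min()/max() on TEXT works here; in PostgreSQL you would keep the DATE type and the original query verbatim.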

db<>fiddle here
Old sqlfiddle


  • This answer seems hard to apply in the general case for time series with a set resolution, e.g. every 2 days or every 10 minutes. Timestamps would need to be cast to real values, and floating-point operations aren't precise. Are there alternatives? – mlt Mar 09 '18 at 20:55
  • 1
    I take my words back. This solution works with timestamp columns akin to `extract(epoch from timestamp) - extract(epoch from interval '1day')*row_number() over (order by timestamp)` – mlt Mar 09 '18 at 21:18
  • After all these years, I realized it is not duplicate-proof https://www.db-fiddle.com/f/31UdD6udUZqwQk43pRbGKZ/1 – mlt Jun 20 '20 at 05:46
  • 2
    Use `dense_rank` instead of `row_number` if you happen to have duplicates. – mlt Jun 20 '20 at 23:23
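The last two comments can be demonstrated side by side. A duplicate date makes row_number() advance past the island's position while dense_rank() does not, so only dense_rank() keeps each island intact. Again a sketch using sqlite3 in place of PostgreSQL (an assumption; SQLite 3.25+ needed), with a hypothetical helper islands() parameterized by the ranking function purely for illustration:

```python
# Sketch: duplicates break row_number()-based grouping; dense_rank() survives.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reserved_dates (reserved_date TEXT NOT NULL);
    INSERT INTO reserved_dates VALUES
        ('2014-10-11'), ('2014-10-12'), ('2014-10-12'),  -- duplicate day
        ('2014-10-13'),
        ('2014-10-15');
""")

def islands(rank_fn):
    # rank_fn is interpolated for demonstration only -- never build SQL
    # from untrusted input like this in real code.
    return conn.execute(f"""
        SELECT min(reserved_date), max(reserved_date)
        FROM (
            SELECT reserved_date,
                   CAST(julianday(reserved_date) AS INTEGER)
                     - {rank_fn}() OVER (ORDER BY reserved_date) AS grp
            FROM reserved_dates
        ) sub
        GROUP BY grp
        ORDER BY grp
    """).fetchall()

# row_number() assigns the duplicate its own rank, shifting every later
# row's grp value and scrambling the islands:
print(islands("row_number"))

# dense_rank() gives equal dates equal ranks, so the islands come out right:
print(islands("dense_rank"))
# [('2014-10-11', '2014-10-13'), ('2014-10-15', '2014-10-15')]
```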