4

I followed this blogpost: https://info.crunchydata.com/blog/range-types-recursion-how-to-search-availability-with-postgresql

CREATE TABLE travels (
    id serial PRIMARY KEY,
    travel_dates daterange NOT NULL,
    EXCLUDE USING spgist (travel_dates WITH &&)
);

and found this function to be buggy when I've inserted rows with duration back to back

CREATE OR REPLACE FUNCTION travels_get_available_dates(daterange)
RETURNS TABLE(available_dates daterange)
AS $$
    WITH RECURSIVE calendar AS (
        SELECT
            $1 AS left,
             $1 AS center,
             $1 AS right
        UNION
        SELECT
            CASE travels.travel_dates && calendar.left
                WHEN TRUE THEN daterange(lower(calendar.left), lower(travels.travel_dates * calendar.left))
                ELSE daterange(lower(calendar.right), lower(travels.travel_dates * calendar.right))
            END AS left,
            CASE travels.travel_dates && calendar.left
                WHEN TRUE THEN travels.travel_dates * calendar.left
                ELSE travels.travel_dates * calendar.right
            END AS center,
            CASE travels.travel_dates && calendar.right
                WHEN TRUE THEN daterange(upper(travels.travel_dates * calendar.right), upper(calendar.right))
                ELSE daterange(upper(travels.travel_dates * calendar.left), upper(calendar.left))
            END AS right
        FROM calendar
        JOIN travels ON
            travels.travel_dates && $1 AND
            travels.travel_dates <> calendar.center AND (
                travels.travel_dates && calendar.left OR
                travels.travel_dates && calendar.right
            )
)
SELECT *
FROM (
    SELECT
        a.left AS available_dates
    FROM calendar a
    LEFT OUTER JOIN calendar b ON
        a.left <> b.left AND
        a.left @> b.left
    GROUP BY a.left
    HAVING NOT bool_or(COALESCE(a.left @> b.left, FALSE))
    UNION
    SELECT
        a.right AS available_dates
    FROM calendar a
    LEFT OUTER JOIN calendar b ON
        a.right <> b.right AND
        a.right @> b.right
    GROUP BY a.right
    HAVING NOT bool_or(COALESCE(a.right @> b.right, FALSE))
) a
$$ LANGUAGE SQL STABLE;

INSERT INTO travels (travel_dates)
VALUES
    (daterange('2018-03-02', '2018-03-02', '[]')),
    (daterange('2018-03-06', '2018-03-09', '[]')),
    (daterange('2018-03-11', '2018-03-12', '[]')),
    (daterange('2018-03-16', '2018-03-17', '[]')),
    (daterange('2018-03-25', '2018-03-27', '[]'));

This works as expected at this point.

SELECT *
FROM travels_get_available_dates(daterange('2018-03-01', '2018-04-01'))
ORDER BY available_dates;
available_dates
-------------------------
[2018-03-01,2018-03-02)
[2018-03-03,2018-03-06)
[2018-03-10,2018-03-11)
[2018-03-13,2018-03-16)
[2018-03-18,2018-03-25)
[2018-03-28,2018-04-01)

But when this row is added:

INSERT INTO travels (travel_dates)
VALUES
(daterange('2018-03-03', '2018-03-05', '[]'));

And re-run

SELECT *
FROM travels_get_available_dates(daterange('2018-03-01', '2018-04-01'))
ORDER BY available_dates;

I get

available_dates
-------------------------
empty
mike james
  • 8,840
  • 3
  • 18
  • 21

4 Answers4

3

I've added a comment on the original blog post as to where I think the error is arising from, that is, in the way empty ranges are handled.

When the date ranges are consecutive, or rather adjacent, it results in 'empty' ranges in either, or even both the 'left' and 'right' columns. Now, after the recursive CTE completes, (and supposing that the empty ranges are in the 'left' column), in the 'LEFT OUTER JOIN ... ON ...' clause, a free and valid travel_date gets paired with an 'empty' range from B.left range since A.left <> 'empty' && A.left @> 'empty' since all ranges contain the empty range trivially. Ideally, it should be paired with a NULL since this is a left outer join for it to be included in the final result set but the 'empty' kinda got in the way. The 'empty' then pops up again in the 'GROUP BY ... HAVING ...' clause where a.left @> 'empty' evaluates to true and it's negated hence all valid travel dates are discared resulting in an empty table. My solution is as follows, have the 'emptys' as NULL, and discard any daterange that's in the 'center':

CREATE OR REPLACE FUNCTION travels_get_available_dates(daterange)
RETURNS TABLE(available_dates daterange)
AS $$
    WITH RECURSIVE calendar AS (
        SELECT
            $1 AS left,
             $1 AS center,
             $1 AS right
        UNION
        SELECT
            CASE travels.travel_dates && calendar.left
                WHEN TRUE THEN daterange(lower(calendar.left), lower(travels.travel_dates * calendar.left))
                ELSE daterange(lower(calendar.right), lower(travels.travel_dates * calendar.right))
            END AS left,
            CASE travels.travel_dates && calendar.left
                WHEN TRUE THEN travels.travel_dates * calendar.left
                ELSE travels.travel_dates * calendar.right
            END AS center,
            CASE travels.travel_dates && calendar.right
                WHEN TRUE THEN daterange(upper(travels.travel_dates * calendar.right), upper(calendar.right))
                ELSE daterange(upper(travels.travel_dates * calendar.left), upper(calendar.left))
            END AS right
        FROM calendar
        JOIN travels ON
            travels.travel_dates && $1 AND
            travels.travel_dates <> calendar.center AND (
                travels.travel_dates && calendar.left OR
                travels.travel_dates && calendar.right
            )
)
SELECT *
FROM (
    SELECT
        a.left AS available_dates
    FROM calendar a
    LEFT OUTER JOIN calendar b ON
        a.left <> b.left AND
        a.left @> b.left
    GROUP BY a.left
    HAVING NOT bool_or(coalesce(a.left @> case when isempty(b.left) then null else b.left end, FALSE))

    UNION

    SELECT
        a.right AS available_dates
    FROM calendar a
    LEFT OUTER JOIN calendar b ON
        a.right <> b.right AND
        a.right @> b.right
    GROUP BY a.right
    HAVING NOT bool_or(coalesce(a.right @> case when isempty(b.right) then null else b.right end, false))

    EXCEPT

    SELECT a.center AS available_dates
    FROM calendar a
    LEFT OUTER JOIN calendar b ON
        a.center <> b.center AND
        a.center @> b.center
    GROUP BY a.center
    HAVING NOT bool_or(COALESCE(a.center @> b.center, FALSE))
) a
WHERE NOT isempty(a.available_dates)
$$ LANGUAGE SQL STABLE;
nagamocha
  • 31
  • 2
0

I think that you should take another approach:

CREATE OR REPLACE FUNCTION travels_get_available_dates(daterange)
RETURNS TABLE(
  available_dates daterange
)
AS $$
  WITH RECURSIVE calendar(available_dates) AS
  (
    SELECT 
      CASE 
        WHEN $1 @> travel_dates THEN unnest(array[
          daterange(lower($1),lower(travel_dates)),
          daterange(upper(travel_dates),upper($1)) 
        ])
        WHEN lower($1) < lower(travel_dates) THEN daterange(lower($1),lower(travel_dates)) 
        WHEN upper($1) > upper(travel_dates) THEN daterange(upper(travel_dates),upper($1)) 
      END
    FROM travels 
      WHERE $1 && travel_dates AND NOT travel_dates @> $1
    UNION
    SELECT 
      CASE 
        WHEN available_dates @> travel_dates THEN unnest(array[
          daterange(lower(available_dates),lower(travel_dates)), 
          daterange(upper(travel_dates),upper(available_dates)) 
        ])
        WHEN lower(available_dates) < lower(travel_dates) THEN daterange(lower(available_dates),lower(travel_dates)) 
        WHEN upper(available_dates) > upper(travel_dates) THEN daterange(upper(travel_dates),upper(available_dates)) 
      END
    FROM travels 
      JOIN calendar ON available_dates && travel_dates AND NOT travel_dates @> available_dates
  )

  SELECT $1 AS available_dates 
    WHERE NOT EXISTS(SELECT 1 FROM travels WHERE travel_dates <@ $1)    
  UNION
  SELECT * FROM calendar
    WHERE $1 <> available_dates AND 'empty' <> available_dates
      AND NOT EXISTS(SELECT 1 FROM travels WHERE available_dates && travel_dates)
$$ LANGUAGE SQL STABLE;

We have to recursively split the given range on left and right segments and then get only those which are not occupied.

IVO GELOV
  • 13,496
  • 1
  • 17
  • 26
  • This functions throws an error: "ERROR: set-returning functions are not allowed in CASE LINE 10: WHEN $1 @> travel_dates THEN unnest(array[ ^ – jkatz05 Feb 18 '20 at 13:55
  • I tested it on SQLfiddle with PostgreSQL 9.6 - http://sqlfiddle.com/#!17/0ee63/1 – IVO GELOV Feb 18 '20 at 19:15
  • thanks for answer I can see it working on sqlfiddle but not on my postgres instance locally. Same error as @jkatz05 - ```NOTICE: drop cascades to 2 other objects DETAIL: drop cascades to table travels drop cascades to function travels_get_available_dates(daterange) ERROR: set-returning functions are not allowed in CASE LINE 31: WHEN $1 @> travel_dates THEN unnest(array[ ^ HINT: You might be able to move the set-returning function into a LATERAL FROM item. SQL state: 0A000 Character: 876``` – mike james Feb 19 '20 at 00:29
  • We need both the left and right parts of the split. Perhaps you can modify the query to use 2 CTEs - one for the left parts and one for the right parts and then take the union - but I do not think this will produce a correct output. – IVO GELOV Feb 19 '20 at 10:22
0

I had originally forgot the clause for the "center" area. Here it is below:

CREATE OR REPLACE FUNCTION travels_get_available_dates(daterange)
RETURNS TABLE(available_dates daterange)
AS $$
    WITH RECURSIVE calendar AS (
        SELECT
            $1 AS left,
             $1 AS center,
             $1 AS right
        UNION
        SELECT
            CASE travels.travel_dates && calendar.left
                WHEN TRUE THEN daterange(lower(calendar.left), lower(travels.travel_dates * calendar.left))
                ELSE daterange(lower(calendar.right), lower(travels.travel_dates * calendar.right))
            END AS left,
            CASE travels.travel_dates && calendar.left
                WHEN TRUE THEN travels.travel_dates * calendar.left
                ELSE travels.travel_dates * calendar.right
            END AS center,
            CASE travels.travel_dates && calendar.right
                WHEN TRUE THEN daterange(upper(travels.travel_dates * calendar.right), upper(calendar.right))
                ELSE daterange(upper(travels.travel_dates * calendar.left), upper(calendar.left))
            END AS right
        FROM calendar
        JOIN travels ON
            travels.travel_dates && $1 AND
            travels.travel_dates <> calendar.center AND (
                travels.travel_dates && calendar.left OR
                travels.travel_dates && calendar.right
            )
)
SELECT *
FROM (
    SELECT
        a.left AS available_dates
    FROM calendar a
    LEFT OUTER JOIN calendar b ON
        a.left <> b.left AND
        a.left @> b.left
    GROUP BY a.left
    HAVING NOT bool_or(COALESCE(a.left @> b.left, FALSE))
    UNION
    SELECT a.center AS available_dates
    FROM calendar a
    LEFT OUTER JOIN calendar b ON
        a.center <> b.center AND
        a.center @> b.center
    GROUP BY a.center
    HAVING NOT bool_or(COALESCE(a.center @> b.center, FALSE))
    UNION
    SELECT
        a.right AS available_dates
    FROM calendar a
    LEFT OUTER JOIN calendar b ON
        a.right <> b.right AND
        a.right @> b.right
    GROUP BY a.right
    HAVING NOT bool_or(COALESCE(a.right @> b.right, FALSE))
) a
WHERE NOT isempty(a.available_dates)
$$ LANGUAGE SQL STABLE;
jkatz05
  • 81
  • 1
  • Thanks for the answer I cannot see this working though it seems to miss availability between 3rd - 6th when just inserting ```INSERT INTO travels (travel_dates) VALUES (daterange('2018-03-02', '2018-03-02', '[]')), (daterange('2018-03-06', '2018-03-09', '[]')), (daterange('2018-03-11', '2018-03-12', '[]')), (daterange('2018-03-16', '2018-03-17', '[]')), (daterange('2018-03-18', '2018-03-19', '[]')), (daterange('2018-03-25', '2018-03-27', '[]'));``` – mike james Feb 19 '20 at 00:28
0

I couldn't get the recursive functions to work -- I would just get an infinite loop. You don't need recusion to solve this problem, however! You can use a PostgreSQL WINDOW instead.

https://www.postgresql.org/docs/current/tutorial-window.html

Given the original:

CREATE TABLE travels (
    id serial PRIMARY KEY,
    travel_dates daterange NOT NULL,
    EXCLUDE USING spgist (travel_dates WITH &&)
);

and inserting the following values:

INSERT INTO travels (travel_dates)
VALUES
    (daterange('2018-03-02', '2018-03-02', '[]')),
    (daterange('2018-03-06', '2018-03-09', '[]')),
    (daterange('2018-03-11', '2018-03-12', '[]')),
    (daterange('2018-03-16', '2018-03-17', '[]')),
    (daterange('2018-03-25', '2018-03-27', '[]'));

The following SQL will find ALL available dates (I've thrown in the number of consecutive available dates because that was something I needed):

SELECT LOWER(lead(travel_dates) OVER w) - UPPER(travel_dates) as available_count,
       UPPER(travel_dates) AS available_start
  FROM travels
WINDOW w AS (ORDER BY travel_dates ASC);

Note available_count is null to indicate infinite availability at that point (last date in the travels table). You could handle the null in a different way if needed for your application; additionally if you wanted to restrict this to check availability between two given dates, you can add a WHERE clause thusly (limiting to between 2018-03-01 and 2018-03-15 for instance):

SELECT LOWER(lead(travel_dates) OVER w) - UPPER(travel_dates) as available_count,
       UPPER(travel_dates) AS available_start
  FROM travels
 WHERE '[2018-03-01,2018-03-15]'::daterange @> travel_dates
WINDOW w AS (ORDER BY travel_dates ASC);

In this case, you'd want to ignore the null value; I haven't figured out the cleanest way to do that, but you could make it a subquery ...


SELECT sq.available_count, sq.available_start
  FROM (
    SELECT LOWER(lead(travel_dates) OVER w) - UPPER(travel_dates) as available_count,
           UPPER(travel_dates) AS available_start
      FROM travels
     WHERE '[2018-03-01,2018-03-15]'::daterange @> travel_dates
    WINDOW w AS (ORDER BY travel_dates ASC)
  ) AS sq
 WHERE sq.available_count is not null;

I believe there's a slicker way to do that ... but I don't know what it is :) Note that you can use 'today' in your WHERE clause if you want to include the present day in your range and get availability between now and the first day in your 'travels' table.

I hope this helps someone else; this was probably about two days of work to sort this out!

user914330
  • 78
  • 5