1

I have a table in Postgres which looks like below:

CREATE TABLE my_features
(
  id uuid NOT NULL,
  feature_id uuid NOT NULL,
  begin_time timestamptz NOT NULL,
  duration integer NOT NULL
)

For each feature_id there may be multiple rows with time ranges specified by begin_time .. (begin_time + duration). duration is in milliseconds. They may overlap. I'm looking for a fast way to find all feature_ids that have any overlaps.

I have referred to this - Query Overlapping time range which is similar but works on a fixed time end time.

I have tried the below query but it is throwing an error.

Query:

select c1.*
from my_features c1
where exists (select 1
              from my_features c2
              where tsrange(c2.begin_time, c2.begin_time + '30 minutes'::INTERVAL, '[]') && tsrange(c1.begin_time, c1.begin_time + '30 minutes'::INTERVAL, '[]')
                and c2.feature_id = c1.feature_id
                and c2.id <> c1.id);

Error:

ERROR: function tsrange(timestamp with time zone, timestamp with time zone, unknown) does not exist LINE 5: where tsrange(c2.begin_time, c2.begin_time...

I have used a default time interval here because I did not understand how to convert the time into minutes and substitute it with 'n minutes'.

Vaulstein
  • 20,055
  • 8
  • 52
  • 73

4 Answers4

2

If you need a solution faster than O(n²), then you can use constraints on ranges with btree_gist extension, possibly on a temporary table:

CREATE TEMPORARY TABLE my_features_ranges (
  id uuid NOT NULL,
  feature_id uuid NOT NULL,
  range tstzrange NOT NULL,
  EXCLUDE USING GIST (feature_id WITH =, range WITH &&)
);
INSERT INTO my_features_ranges (id, feature_id, range)
select id, feature_id, tstzrange(begin_time, begin_time+duration*'1ms'::interval)
from my_features
on conflict do nothing;

select id from my_features except select id from my_features_ranges;
Tometzky
  • 22,573
  • 5
  • 59
  • 73
1

Using OVERLAPS predicate:

SELECT * -- DISTINCT f1.*
FROM my_features f1
JOIN my_features f2
  ON f1.feature_id = f2.feature_id
 AND f1.id <> f2.id
 AND (f1.begin_time::date, f1.begin_time::date + '30 minutes'::INTERVAL)
    OVERLAPS (f2.begin_time::date, f2.begin_time::date + '30 minutes'::INTERVAL);

db<>fiddle demo

Tometzky
  • 22,573
  • 5
  • 59
  • 73
Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
0

The problem was, I was using tsrange on a column with timezone and for timestamp with timezone, there exist another function called tstzrange

Below worked for me:

EDIT: Added changes suggested by @a_horse_with_no_name

select c1.*
from my_features c1
where exists (select 1
              from my_features c2
              where tstzrange(c2.begin_time, c2.begin_time + make_interval(secs => c2.duration / 1000), '[]') && tstzrange(c1.begin_time, c1.begin_time + make_interval(secs => c1.duration / 1000), '[]')
                and c2.feature_id = c1.feature_id
                and c2.id <> c1.id);

However, the part of calculating interval dynamically is still pending

Vaulstein
  • 20,055
  • 8
  • 52
  • 73
0

Or try this

select c1.*
from jak.my_features c1
where exists (select 1
              from jak.my_features c2
              where tsrange(c2.begin_time::date, c2.begin_time::date + '30 minutes'::INTERVAL, '[]') && tsrange(c1.begin_time::date, c1.begin_time::date + '30 minutes'::INTERVAL, '[]') and 
                c2.feature_id = c1.feature_id
                and c2.id <> c1.id);
WiatroBosy
  • 1,076
  • 1
  • 6
  • 17