Given a table of half-open unix timestamp intervals at microsecond precision [start_time_us; stop_time_us):
CREATE TABLE interval(
start_time_us INTEGER NOT NULL,
stop_time_us INTEGER NOT NULL
);
I want to select intervals which intersect a given query interval [:start_time_us, :stop_time_us):
SELECT * FROM interval
WHERE start_time_us < :stop_time_us
AND :start_time_us < stop_time_us
ORDER BY start_time_us;
Trying to make it go fast, I create these covering indices:
CREATE INDEX interval_start_idx
ON interval(start_time_us, stop_time_us);
CREATE INDEX interval_stop_idx
ON interval(stop_time_us, start_time_us);
However, if I EXPLAIN QUERY PLAN
the above query, I get:
QUERY PLAN
`--SEARCH interval USING INDEX interval_start_idx (start_time_us<?)
Which, as I understand it, means that SQLite will iterate rows of the interval_start_idx
index B-tree from the beginning while start_time_us < :stop_time_us
, returning only rows which have :start_time_us < stop_time_us
. Which will, on average, iterate half the B-tree, which is O(N), which is too slow.
On reflection it makes sense, since to make use of the second index and iterate only the required section of the B-tree, SQLite needs to know that start_time_us < stop_time_us
for every row. And even if it did know this (e.g. from a CHECK
constraint), it's not smart enough to figure such an optimization out.
I can translate from one index to another in a subquery:
SELECT * FROM interval
WHERE
(
SELECT start_time_us
FROM interval
WHERE :start_time_us < stop_time_us
ORDER BY stop_time_us
LIMIT 1
) <= start_time_us
AND start_time_us < :stop_time_us
ORDER BY start_time_us;
Which produces:
QUERY PLAN
|--SEARCH interval USING INDEX idx_interval_start (start_time_us>? AND start_time_us<?)
`--SCALAR SUBQUERY 1
`--SEARCH interval USING INDEX idx_interval_stop (stop_time_us>?)
But it's only correct for non-overlapping intervals.
How can I do this in logarithmic time?