This is a combination of self joins and anti joins.
- Self join: A table is joined to itself (here to the row with the ID decremented or incremented by 1).
- Anti join: A left outer join and then a
WHERE
clause to only keep outer-joined rows, thus keeping all rows from the left table that have no match. This is a rather common technique used on young DBMS where joins are already quite optimized and the more straight-forward methods NOT EXISTS
and NOT IN
are not.
What this query does is:
- Find IDs that have no direct predecessor. E.g. for IDs 1, 2, 4, 5, 6, 8, 10, 12, 23, 24 we will find 1, 4, 8, 10, 12, and 23.
- Find IDs that have no direct follower. E.g. for IDs 1, 2, 4, 5, 6, 8, 10, 12, 23, 24 we will find 2, 6, 8, 10, 12, and 24.
- Join the former with the latter where the former <= the latter: 1-2, 1-6, 1-8, ..., 8-8, 8-10, ..., 23-24.
- Get the minimum end ID per start ID: 1-2, 4-6, 8-8, 10-10, 12-12, 23-24.
The query finds thus number ranges. 1, 2, 4, 5, 6, 8, 10, 12, 23, 24 = 1-2, 4-6, 8, 10, 12, 23-24.
This kind of task is called a gaps and island problem. Most often these are solved with window functions:
select min(log(id), max(log_id)
from
(
select
log_id,
log_id - row_number() over (order by log_id) as grp
from logs
) grouped
group by grp
order by grp;
Demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=3eaeb881c8e5498a02fa0ff34f4cffc3