-1

I read a lot of good answers (here, here, here) about finding gaps, but I still can't figure out how to find gaps with a minimal predefined size.

In my case gaps are entries with no name order by HE.

I also need to find gaps starting at the beginning of the table like in the example.

Anyone can help with a nice and clean SQL statement which can be altered uncomplicated to get predefined minimal gap sizes?

Example with expected output:

+-----------+----+      +----------------+      +----------------+      +----------------+
| name      | HE |      |   GAPS >= 1    |      |   GAPS >= 2    |      |   GAPS >= 3    |
+-----------+----+      +-----------+----+      +-----------+----+      +-----------+----+
|           |  1 |      | name      | HE |      | name      | HE |      | name      | HE |
| JohnDoe01 |  2 |      +-----------+----+      +-----------+----+      +-----------+----+
| JohnDoe02 |  3 |      |           |  1 |      |           |  4 |      |           | 12 |
|           |  4 |      |           |  4 |      |           |  5 |      |           | 13 |
|           |  5 |      |           |  5 |      |           |  9 |      |           | 14 |
| JohnDoe03 |  6 |      |           |  9 |      |           | 10 |      +-----------+----+
| JohnDoe04 |  7 |      |           | 10 |      |           | 12 |
| JohnDoe05 |  8 |      |           | 12 |      |           | 13 |
|           |  9 |      |           | 13 |      |           | 14 |
|           | 10 |      |           | 14 |      +-----------+----+
| JohnDoe06 | 11 |      +-----------+----+
|           | 12 |
|           | 13 |      
|           | 14 |      
| JohnDoe07 | 15 |      
+-----------+----+
dnace
  • 23
  • 2

1 Answers1

1

You can identify the gaps and the start and stops. To identify the gaps, count the number of non-gaps and aggregate:

select min(he), max(he), count(*) as size
from (select t.*, count(name) over (order by he) as grp
      from t
     ) t
where name is null
group by grp;

You can then filter using having for gaps of a certain size, say 2:

having count(*) >= 2

for instance.

This summarizes the gaps, with one per row. That actually seems more useful to me than a separate row for each row.

EDIT:

If you actually wanted the original rows, you could do:

select t.*
from (select t.*,
             max(he) filter (where name is not null) over (order by he) as prev_he,
             min(he) filter (where name is not null) over (order by he desc) as next_he,
             max(he) over () as max_he
      from t
     ) t
where name is null and
      (max(next_he, max_he + 1) - coalesce(prev_he, 0) - 1) >= 2;

EDIT II:

In older versions of MySQL/MariaDB, you can use variables:

select min(he), max(he), count(*) as size
from (select t.*,
             (@grp := @grp + (name is not null)) as grp
      from (select t.* from t order by he) t cross join
           (select @grp := 0) params
     ) t
where name is null
group by grp;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Thank you, your explanation helped a lot and I really like the solution, but I mixed up our database setup. We are using mariadb-5.5 (its based on MySQL 5.5) and it looks like it does not support window function. """SELECT t.*, COUNT(name) OVER (ORDER BY HE) as grp FROM Table t""" says SQL syntax error. Can you think of any solution not using window function? – dnace Mar 24 '20 at 21:33