4

I got a table with some DATETIME ranges, something like

id | start               | end
----------------------------------------------
1  | 2011-12-18 16:00:00 | 2011-12-18 17:00:00
2  | 2011-12-19 08:00:00 | 2011-12-19 10:00:00
3  | 2011-12-19 11:00:00 | 2011-12-19 13:00:00
4  | 2011-12-19 12:00:00 | 2011-12-19 14:00:00
5  | 2011-12-19 13:00:00 | 2011-12-19 15:00:00
6  | 2011-12-19 13:00:00 | 2011-12-19 14:00:00
7  | 2011-12-20 13:00:00 | 2011-12-20 14:00:00

So for day 2011-12-19 the ranges spans like this:

8    9   10   11   12   13   14   15
<-------->
               <-------->
                    <-------->
                         <-------->
                         <---->

The goal is, when inserting new record, to find the max number of overlapping ranges already present: i.e.: when inserting the new range 2011-12-19 12:00:00 - 2011-12-19 15:00:00 i would like to receive 3, because the max number of overlapping ranges is 3, from 13:00 to 14:00.

Since now i managed to have this

select
    count(*) as cnt
from
    mytable as p
where
    ( # check if new renge overlap existings ones
        (@start >= start and @start < end)
        or
        (@end > start and @end <= end)
    )
    or
    ( # check if existing range is included by new one
        start between @start and @end
        and
        end between @start and @end
    )

But this return 4 because it detects all ranges except the first, but is wrong.

I already found

But all these questions are slightly different.

I'm on MysQL 5.7, but upgrading to 8 is possibile if necessary.

fudo
  • 2,254
  • 4
  • 22
  • 44
  • Your sample data and expected out do not match. Below is a DateTime overlap query. – Cetin Basoz Dec 11 '18 at 08:42
  • @CetinBasoz Why example data and expected output do not match? – fudo Dec 11 '18 at 08:46
  • I believe that you have an error in your example: you want to insert a new range for 19-12 right? – Radim Bača Dec 11 '18 at 08:49
  • @RadimBača Yes, i didn't get very clear about this, the new example range would be `2011-12-19 12:00:00 - 2011-12-19 15:00:00` – fudo Dec 11 '18 at 08:52
  • @RadimBača as said at the end of the question i'm currently on MySQL 5.7 but upgrading to 8 is not a big deal if necessary – fudo Dec 11 '18 at 08:56
  • @CetinBasoz Ok, i found i wrote the wrong day for the new range (20 instead of 19) – fudo Dec 11 '18 at 09:08
  • OK your expected output still doesn't match. Check the query below I gave, it returns correct output which should be 4 (there are 4 overlapping ranges, not 3). – Cetin Basoz Dec 11 '18 at 10:52
  • @CetinBasoz I still not getting where do i have 4 ranges overlapping, may you pinpoint the time start/end? – fudo Dec 11 '18 at 10:54
  • All the rows with id 3,4,5 6 overlaps with the range given. – Cetin Basoz Dec 11 '18 at 11:00
  • @CetinBasoz yeah, but 3 doesn't overlap 5 nor 6, so the max number of simultaneously overlapping ranges is not 4 but 3, in the time range that span from 13:00 to 14:00 – fudo Dec 11 '18 at 11:10

2 Answers2

6

This answer is for MySQL 8.0 that contains window functions. The core of the solution will be the following query that finds a number of overlapping intervals for every interesting interval in the data:

select t2.startDt, t2.endDt, count(*) overlaps_count
from
(
    select lag(t1.dt) over (order by t1.dt) startDt, t1.dt endDt
    from
    (
        select startt dt from data
        union
        select endt dt from data
    ) t1
) t2
join data on t2.startDt < data.endt and t2.endDt > data.startt
group by t2.startDt, t2.endDt

DBFiddle DEMO

Once you have this result (let call it Overlap table) then you may easily find the maximum for an input interval as follows

with Overlap as
(
   -- the query above
)
select max(overlaps_count)
from Overlap 
where @start < endDt and @end > startDt
Radim Bača
  • 10,646
  • 1
  • 19
  • 33
  • Yeah, this work correctly, but instead of making another super-query as you said wouldn't it the same to simply add a `where` clause in the original one just to filter the ranges that fall inside the new one? – fudo Dec 11 '18 at 10:52
  • or even better, adding the `where` clause in the most inner `union`ed selects? – fudo Dec 11 '18 at 11:03
  • @fudo I'm not sure about the first interval since it will contain null instead of DateTime. I believe that some null handling is needed even in my solution for the first interval and then you can put the condition into the most nested subquery. Best – Radim Bača Dec 11 '18 at 11:07
  • do you mean in the select where you used `lag()`? yes, this record would have `null` as startDT but endDT would coincide with the start of incoming range, so i think it would be discarded (as information) because concerning a range i'm not interested in – fudo Dec 11 '18 at 11:19
  • @fudo true! so in such case you can move the '@start < endDt and @end > startDt' condition into the `Overlap` query. – Radim Bača Dec 11 '18 at 12:45
  • I just noticed you referenced table alias `t1` in the main query in both `select` ad `group by` statement, you should correct it with alias `t2`; I dindn't noticed it before because I rearrenged your query to suit my specific use case – fudo Dec 12 '18 at 09:22
-1

Considering start and end would never be same:

select
    count(*) as cnt
from
    mytable as p
where
    # check if new renge overlap existings ones
        (@start < end and @end > start);
Cetin Basoz
  • 22,495
  • 3
  • 31
  • 39
  • 1
    I think this will also return 4? Because it will not get the *maximum* number of overlapping ranges, but all overlapping ranges, right? – ChatterOne Dec 11 '18 at 08:46
  • @ChatterOne, what does "maximum number of overlapping ranges" mean? How is it different than the number of overlapping ranges? – Cetin Basoz Dec 11 '18 at 10:59
  • 1
    look at the example, you'll see that the overlapping ranges are `[11-13, 12-14]` and `[12-14, 13-15, 13-14]`. With your query you'll get 4 as a result because if any range overlaps with any other range, that will count. But if you want to know the maximum amount of ranges that overlap with each other, that's a different query. In this case the number is 3 because of the `[12-14, 13-15, 13-14]` values. – ChatterOne Dec 11 '18 at 11:17
  • @ChatterOne, sorry I don't understand what you mean. There are 4 overlapping ranges there. – Cetin Basoz Dec 11 '18 at 11:44
  • 1
    Yes, there are 4, but they're not all overlapping with each other. The range `11-13` is overlapping only with `12-14`. It is not, for example, overlapping with `13-15`. In total, there are two ranges that overlap with each other and another three that overlap with each other. – ChatterOne Dec 11 '18 at 11:55
  • @ChatterOne, appearantly I missed your comment. There are 4 overlapping ranges there. 13-15 is an overlapping one too. It overlaps 12-14 and 13-14. – Cetin Basoz Jun 06 '22 at 11:30