I have an issue with removing or tagging overlapping timestamps grouped by certain ID.
Times can overlap in nest and may have same start time or end time.
If second time starts before previous time has ended it will end before or at the same time as previous time. No time differences will go over 12 hours.
Using T-SQL.
Sample data:
ID task_id starttime endtime
11 1 2023-01-10 06:31:00.000 2023-01-10 08:53:00.000
11 1 2023-01-10 08:00:00.000 2023-01-10 08:53:00.000
11 2 2023-01-10 13:14:00.000 2023-01-10 15:15:00.000
11 2 2023-01-10 15:46:00.000 2023-01-10 17:59:00.000
11 2 2023-01-10 18:49:00.000 2023-01-10 18:50:00.000
12 3 2023-01-09 10:10:00.000 2023-01-09 11:10:00.000
12 3 2023-01-09 10:10:00.000 2023-01-09 10:50:00.000
13 4 2023-01-08 20:00:00.000 2023-01-09 03:44:00.000
13 4 2023-01-08 21:00:00.000 2023-01-09 02:00:00.000
14 5 2023-01-01 19:23:00.000 2023-01-01 20:47:00.000
14 5 2023-01-02 03:35:00.000 2023-01-02 06:57:00.000
Desired result:
ID task_id starttime endtime
11 1 2023-01-10 06:31:00.000 2023-01-10 08:53:00.000
11 2 2023-01-10 13:14:00.000 2023-01-10 15:15:00.000
11 2 2023-01-10 15:46:00.000 2023-01-10 17:59:00.000
11 2 2023-01-10 18:49:00.000 2023-01-10 18:50:00.000
12 3 2023-01-09 10:10:00.000 2023-01-09 11:10:00.000
13 4 2023-01-08 20:00:00.000 2023-01-09 03:44:00.000
14 5 2023-01-01 19:23:00.000 2023-01-01 20:47:00.000
14 5 2023-01-02 03:35:00.000 2023-01-02 06:57:00.000
I've tried methods with lead or lag functions but it doesn't seem to play well with edge cases. For example:
case when lead(starttime) over (partition by task_id order by starttime) <> endtime then 1 else 0 end as overlap_tag
Doesn't count the time in ID 11 task_id 2 from 18:49-18:50 as not overlapping and doesn't seem to take into account the day changing.