SQL Partition By clause - need third level but no unique value

Question

I started from this post: Detect consecutive dates and continued researching CTE and OVER and PARTITION BY. I have some data in a log that I have been running through programmatically and I am trying to optimize the process. Picture the sample data this way...

EventCode, TechCode, TaskCode, Time
2021003, 42, D, 2021-08-07 13:30
2021003, 42, E, 2021-08-07 13:30
2021003, 42, C, 2021-08-07 13:40
2021004, 43, D, 2021-08-07 13:35
2021004, 43, A, 2021-08-07 13:37
2021004, 43, X, 2021-08-07 13:38
2021004, 43, D, 2021-08-07 13:50
2021004, 43, E, 2021-08-07 13:50
2021004, 43, C, 2021-08-07 14:00

D,A, and E are start codes. D may not always be the first code in the group. It could also be A or E. C and X are stop codes. From this I would generate 3 rows:

Event,Tech,Start,Stop
2021003, 42, 2021-08-07 13:30, 2021-08-07 13:40
2021004, 43, 2021-08-07 13:35, 2021-08-07 13:38
2021004, 43, 2021-08-07 13:50, 2021-08-07 14:00

to represent the start and stop times of the tech on the event. If a tech worked on an event only once then it is easy enough to partition by eventcode, techcode.

;WITH CTE as (
Select EventCode, TechCode, MIN(Time) OVER (PARTITION BY EventCode, TechCode) as Start, 
MAX(Time) OVER (PARTITION BY EventCode, TechCode) as Stop, 
ROW_NUMBER() over (PARTITION BY EventCode, TechCode ORDER BY EventCode, TechCode, Time) as rownum
)
Select EventCode, TechCode, Start, Stop
from CTE
where rownum = 1

Can anyone see anyway to get one more level so that I can account for the 2 groups for tech 43 on event 2021004? Or is this one that I will just have to keep using my program to loop through the lines looking for stop and start codes to build each row?

score 0 · Accepted Answer · answered Aug 11 '21 at 23:26

This is a type of gaps-and-islands problem.

There are a number of solutions. Here is one:

We identify the starting points of each group by looking at TaskCode for the current and previous row
We use a windowed COUNT to number the groups
We then simply group by EventCode TechCode and the new group number

WITH StartPoints AS (
    SELECT *,
        IsStart = CASE WHEN TaskCode IN ('D','A','E')
             AND LAG(TaskCode, 1, 'C') OVER (PARTITION BY EventCode, TechCode
                 ORDER BY Time) IN ('C', 'X') THEN 1 END
    FROM my_table t
),
Groupings AS (
    SELECT *,
        GroupId = COUNT(IsStart) OVER (PARTITION BY EventCode, TechCode
                  ORDER BY Time ROWS UNBOUNDED PRECEDING)
    FROM StartPoints t
)
SELECT
    EventCode,
    TechCode,
    MIN(Time) Start,
    MAX(Time) Stop
FROM Groupings t
GROUP BY
    EventCode, TechCode, GroupId;

db<>fiddle

SQL Partition By clause - need third level but no unique value

1 Answers1