1

Table Customer

ID Action Date Time
10 Deal 26.04.2023 15:00
10 Meet 25.04.2023 14:15
10 Call 24.04.2023 13:00
10 Wait 23.04.2023 12:00
201 Wait 15.04.2023 14:30
201 Wait 15.04.2023 13:30
201 Call 15.04.2023 12:30
201 Call 15.04.2023 11:30
201 Wait 15.04.2023 10:30
3020 Wait 6.10.2021 8:00
3020 Wait 5.10.2021 8:00
3020 Call 4.10.2021 8:00
3020 Wait 2.10.2021 8:00
3020 Wait 1.10.2021 8:00
3020 Wait 1.10.2021 8:00

I need to find out a wait-streak for each ID.

Definition for wait-streak is:

  1. Starting from ID's newest row, if Action is Wait then wait-streak = 1
  2. If next chronological row with same ID has Action = Wait, wait-streak is added by 1
  3. All rows for ID are chronologically looped and wait-streak is added by 1 if streak continues
  4. If there's are multiple streaks of Wait, biggest must be found and value of newest streak must be preserved

In addition to Date, also Time must be taken into account. Is it better to combine these into a new date/time column or is there a simple way to use existing columns?

After or during the calculation, results should be updated into a new table. Longest found streak into column BiggestWaitStreak, latest streak into column NewestWaitStreak and if newest Action is Wait then column NewestIsWait must be set to 1.

After Update new statistics table should look like:

Table ContactStats

ID BiggestWaitStreak NewestWaitStreak NewestIsWait
10 1 1 0
201 2 2 1
3020 3 2 1

I guess best way to achieve this would be utilizing CTE or temporary tables. Personally I prefer CTE. I would be happy to add my work in progress but unfortunately there is none. I don't know where to start.

Any help a highly appreciated.

Kaptah
  • 9,785
  • 4
  • 22
  • 19
  • You don't show any example of multiple streaks ie series of WAIT followed by CALL then more wait - can that happen? If so this would be solved with a gaps & islands solution. – Stu Apr 26 '23 at 19:58
  • Actually this just popped into my mind and I am currently editing the question. – Kaptah Apr 26 '23 at 20:12
  • 2
    What did you remove the table formatting for? I've put it back in. Why does ID 3020 get a value of 2 shouldn't it be 1 as the longest Wait streak is only one row? – Charlieface Apr 26 '23 at 22:34
  • why for `3020`, `BiggestWaitStreak` is `3`? and `NewestIsWait` is `1`? – Squirrel Apr 27 '23 at 03:09
  • That was an error on creating sample data, sorry. Corrected and edited. – Kaptah Apr 27 '23 at 07:43
  • Charlieface, thanks for formatting, looks much nicer. I cannot use my real data atm so I had to manually write a sample and I found it easier to do in external editor. I have to learn to use platform's table editor. – Kaptah Apr 27 '23 at 07:46

1 Answers1

1

This is a classic gaps-and-islands problem

The classic solution to find islands of data is to use LAG and/or LEAD to find the beginning/end of islands, then use a windowed COUNT to create an ID per section.

Then it is simply a matter of using more window functions prtitioned over that ID. There is a slight tweak that NewestIsWait can be calculated separately using FIRST_VALUE.

WITH PrevValues AS (
    SELECT *,
      PrevAction = LAG(Action) OVER (PARTITION BY ID ORDER BY Date DESC, Time DESC),
      NewestIsWait = CASE WHEN
          FIRST_VALUE(Action) OVER (PARTITION BY ID ORDER BY Date DESC, Time DESC ROWS UNBOUNDED PRECEDING)
          = 'Wait' THEN 1 ELSE 0 END
    FROM YourTable
),
Grouped AS (
    SELECT *,
      GroupId = COUNT(CASE WHEN Action = 'Wait' AND (PrevAction <> 'Wait' OR PrevAction IS NULL) THEN 1 END)
        OVER (PARTITION BY ID ORDER BY Date DESC, Time DESC ROWS UNBOUNDED PRECEDING)
    FROM PrevValues
),
Counted AS (
    SELECT *,
      StreakLength = COUNT(CASE WHEN Action = 'Wait' THEN 1 END)
        OVER (PARTITION BY ID, GroupId)
    FROM Grouped
)
SELECT
  ID,
  Action,
  Date,
  Time,
  BiggestWaitStreak = MAX(StreakLength) OVER (PARTITION BY ID),
  NewestWaitStreak = MIN(CASE WHEN GroupId = 1 THEN StreakLength END) OVER (PARTITION BY ID),
  NewestIsWait
FROM Counted;

db<>fiddle

I suggest you consider combining the Date and Time columns into one.

Charlieface
  • 52,284
  • 6
  • 19
  • 43
  • Thanks for you answer, looking ingenious. I will try this soon. – Kaptah Apr 28 '23 at 05:45
  • This question has new requirement, UPDATE should be done into a another table. This is smarter in all aspects. – Kaptah Apr 28 '23 at 05:49
  • You probably need a joined update https://stackoverflow.com/questions/982919/sql-update-query-using-joins but I suggest you make a new question for that if you are having problems with it – Charlieface Apr 28 '23 at 08:55