1

I need to calculate the amount of a resource consumed by hour. For example, consider phone call minutes. For the following records, I’d like to generate the following results:

Call_ID   StartDateTime     EndDateTime  
1         3/15/2014 11:25   3/15/2014 14:01  
2         3/15/2014 11:50   3/15/2014 13:10  
3         3/15/2014 12:05   3/15/2014 12:55  
4         3/15/2014 13:04   3/15/2014 15:02  
5         3/15/2014 13:15   3/15/2014 14:22  
6         3/15/2014 14:35   3/15/2014 15:18  
7         3/15/2014 15:10   3/15/2014 15:29  

Date        Hour    CallMinutes  
3/15/2014   11       45  
3/15/2014   12       170  
3/15/2014   13       171  
3/15/2014   14       108  
3/15/2014   15       39  

I have SQL that will do this, but it uses a cursor and I would like to find a faster alternative. Here’s the working code I have now:

/* create & fill the Inputs table */  
CREATE TABLE PhoneCalls(  
    Call_ID         int IDENTITY(1,1) NOT NULL,  
    StartDateTime   datetime NOT NULL,  
    EndDateTime     datetime NOT NULL,  
 CONSTRAINT PK_PhoneCalls PRIMARY KEY CLUSTERED ( Call_ID ASC) ON [PRIMARY]  
) ON [PRIMARY];  

INSERT INTO PhoneCalls (StartDateTime, EndDateTime) VALUES ('3/15/2014 11:25', '3/15/2014 14:01');  
INSERT INTO PhoneCalls (StartDateTime, EndDateTime) VALUES ('3/15/2014 11:50', '3/15/2014 13:10');  
INSERT INTO PhoneCalls (StartDateTime, EndDateTime) VALUES ('3/15/2014 12:05','3/15/2014 12:55');  
INSERT INTO PhoneCalls (StartDateTime, EndDateTime) VALUES ('3/15/2014 13:04','3/15/2014 15:02');  
INSERT INTO PhoneCalls (StartDateTime, EndDateTime) VALUES ('3/15/2014 13:15','3/15/2014 14:22');  
INSERT INTO PhoneCalls (StartDateTime, EndDateTime) VALUES ('3/15/2014 14:35','3/15/2014 15:18');  
INSERT INTO PhoneCalls (StartDateTime, EndDateTime) VALUES ('3/15/2014 15:10','3/15/2014 15:29');  

/* Create the Temp table to hold the intermediate results */
Create Table #HourlyCallTimes (
    Call_ID     int NOT NULL,
    CallDate    date NOT NULL,
    CallHour    smallint NOT NULL,
    CallTime    int NOT NULL);

/* Determine Hourly totals of CallTime using a cursor */
SET NOCOUNT ON;
DECLARE @Call_ID int, @StartDt datetime, @EndDt datetime, @CallLength int, @Hour int, @CallTime int;
declare CallsCursor cursor for
    select Call_ID, StartDateTime, EndDateTime, DATEDIFF(minute, StartDateTime, EndDateTime) as CallLength
    FROM PhoneCalls
open CallsCursor
fetch next from CallsCursor into @Call_ID, @StartDt, @EndDt, @CallLength
while @@FETCH_STATUS = 0
begin
    /* make sure that the call was at least a minute long */
    IF (@CallLength > 0.01)
    BEGIN
        -- Record the call time for the first partial hour
        SET @Hour = DATEPART(Hour, @StartDt)
        SET @CallTime = 60 - DATEPART(Minute, @StartDt)
        if (@CallLength < @CallTime) -- check if this us the last (partial) hour
            SET @CallTime = @CallLength
        INSERT INTO #HourlyCallTimes (Call_ID, CallDate, CallHour, CallTime)
            VALUES (@Call_ID, CAST(@StartDt as DATE), @Hour, @CallTime)

        -- move to the beginning of the next hour
        SET @StartDt = DATEADD(HOUR,@Hour+1,CAST(CAST(@StartDt as DATE) as DateTime))
        SET @CallLength = @CallLength - @CallTime
        While @StartDt < @EndDt
        BEGIN
            SET @Hour = DATEPART(Hour, @StartDt)
            SET @CallTime = 60;
            if (@CallLength < @CallTime)  -- check if this is the last (partial) hour
                SET @CallTime = @CallLength
        INSERT INTO #HourlyCallTimes (Call_ID, CallDate, CallHour, CallTime)
            VALUES (@Call_ID, CAST(@StartDt as DATE), @Hour, @CallTime)
            -- move to the beginning of the next hour
            SET @StartDt = DATEADD(Hour,1,@StartDt)     
            SET @CallLength = @CallLength - @CallTime
        END
    END
    fetch next from CallsCursor into @Call_ID, @StartDt, @EndDt, @CallLength
END
close CallsCursor;
deallocate CallsCursor;

/* here's the final results table */
Create Table HourlyCallTotals (
    CallDate    date NOT NULL,
    CallHour    smallint NOT NULL,
    CallTime    int NOT NULL);

Insert into HourlyCallTotals (CallDate, CallHour, CallTime)
    select CallDate, CallHour, SUM(CallTime) from #HourlyCallTimes group by CallDate, CallHour;

DROP TABLE #HourlyCallTimes;
DROP TABLE PhoneCalls;

SELECT * FROM HourlyCallTotals;
DROP TABLE HourlyCallTotals;

/* expected Results */
/*  CallDate    CallHour    CallTime
    3/15/2014       11          45
    3/15/2014       12          170
    3/15/2014       13          171
    3/15/2014       14          108
    3/15/2014       15          39 */
PravinS
  • 2,640
  • 3
  • 21
  • 25
Scott Duncan
  • 1,041
  • 1
  • 12
  • 22
  • You can use a common table expression (CTE) or numbers table to generate the list of dates and hours, then summarize the resources consumed in each hour. A related answer for summarizing periods into quarter hour windows is [here](http://stackoverflow.com/questions/29523163/how-to-sum-the-activity-time-that-occurred-within-15-minute-intervals-using-over/29527525#29527525). In any event, it isn't [RBAR](https://www.simple-talk.com/sql/t-sql-programming/rbar--row-by-agonizing-row/). Tip: Tag database questions with both the software and version, e.g. `sql-server-2012`, as the features vary. – HABO Mar 16 '16 at 02:04
  • 1
    @HABO Beware though, recursive CTE uses hidden RBAR. [**Here**](http://www.sqlservercentral.com/articles/T-SQL/74118/) is an article written by Jeff Moden himself, the one who coined `RBAR`, for more info – Felix Pamittan Mar 16 '16 at 02:42

2 Answers2

1

You can try this recursive CTE to get the desired result:

;WITH cte_break AS
(
SELECT 
CONVERT(DATE,StartDateTime) AS CallDate, 
DATEPART(HOUR,StartDateTime) AS CallHour, 
CASE WHEN DATEDIFF(MINUTE,StartDateTime, EndDateTime) < 60 AND DATEPART(HOUR, StartDateTime) = DATEPART(HOUR, EndDateTime) THEN DATEDIFF(MINUTE,StartDateTime, EndDateTime) 
ELSE (60 - DATEPART(MINUTE,StartDateTime)) END AS CallTime,
DATEADD(minute, CASE WHEN DATEDIFF(MINUTE,StartDateTime, EndDateTime) < 60 THEN DATEDIFF(MINUTE,StartDateTime, EndDateTime) ELSE (60 - DATEPART(MINUTE,StartDateTime)) END , StartDateTime) AS CalcStartTime,
EndDateTime
FROM dbo.PhoneCalls

UNION ALL
SELECT 
CONVERT(DATE,CalcStartTime) AS CallDate, 
DATEPART(HOUR,CalcStartTime) AS CallHour, 
CASE WHEN DATEDIFF(MINUTE,CalcStartTime, EndDateTime) < 60 AND DATEPART(HOUR, CalcStartTime) = DATEPART(HOUR, cte_break.EndDateTime) THEN DATEDIFF(MINUTE,CalcStartTime, EndDateTime) 
ELSE (60 - DATEPART(MINUTE,CalcStartTime)) END AS CallTime,
DATEADD(minute, CASE WHEN DATEDIFF(MINUTE,CalcStartTime, EndDateTime) < 60 THEN DATEDIFF(MINUTE,CalcStartTime, EndDateTime) ELSE (60 - DATEPART(MINUTE,CalcStartTime)) END , CalcStartTime) AS CalcStartTime,
EndDateTime
FROM cte_break
WHERE CalcStartTime < EndDateTime

)
SELECT CallDate, CallHour, SUM(CallTime) AS CallTime FROM cte_break
GROUP BY CallDate, CallHour
ORDER BY CallDate, CallHour
Amit Sukralia
  • 950
  • 1
  • 5
  • 13
0

To get rid of the cursor/loop: instead of counting the time call by call and hour by hour, you need to list all the possible hours (from min(StartHour) to max(EndHour)) and count the per call per hour time. Think of it as a matrix where the columns are the hours and the rows are calls, just sum up the columns to get the hourly call time.

ID Hour0 Hour1 Hour2 Hour3 ...
1 0 0 4 0
2 0 0 3 0
3 0 1 2 0
...

Now, more counting/calculations need to be done (at worst 23 times more than the loop approach), but the benefit of changing from loop to set-based logic far outweighs the cost.

Procedural vs Set-Based SQL

Here is a partial solution (can not deal with different dates):

DECLARE @maxHour INT, @minHour INT, @disHours INT

SELECT  @minHour = MIN(DATEPART(HOUR, StartDateTime)),
    @maxHour = MAX(DATEPART(HOUR, EndDateTime)),
    @disHours = @maxHour - @minHour + 1
FROM    PhoneCalls

-- Filling out a range from @minHour to @maxHour
SELECT TOP (@disHours) @minHour -1 + ROW_NUMBER() OVER (ORDER BY [object_id]) AS CallHour
INTO    #hours
FROM    sys.all_objects 
ORDER BY CallHour

-- Count the per call per hour call time
SELECT  CallHour,
    CASE
        WHEN DATEPART(HOUR, StartDateTime) < CallHour AND DATEPART(HOUR, EndDateTime) > CallHour THEN 60
        WHEN DATEPART(HOUR, StartDateTime) = CallHour AND DATEPART(HOUR, EndDateTime) = CallHour THEN DATEPART(MINUTE, EndDateTime) - DATEPART(MINUTE, StartDateTime)
        WHEN DATEPART(HOUR, StartDateTime) < CallHour AND DATEPART(HOUR, EndDateTime) = CallHour THEN DATEPART(MINUTE, EndDateTime)
        WHEN DATEPART(HOUR, StartDateTime) = CallHour AND DATEPART(HOUR, EndDateTime) > CallHour THEN 60 - DATEPART(MINUTE, StartDateTime)
        ELSE 0
    END     AS CallTimePerCallPerHour
INTO    #cross
FROM    PhoneCalls
    CROSS JOIN #hours

-- The final result
SELECT  CallHour,
    SUM(CallTimePerCallPerHour) AS CallTime
FROM    #cross
WHERE   CallTimePerCallPerHour > 0  -- To left out hours with 0 CallTime in the result
GROUP BY CallHour

DROP TABLE #hours
DROP TABLE #cross

For about 1M rows, with some random start/end time ranging from 00:00 to 23:59 (worst case scenario for the set-based approach), runtime dropped from 90s to 20s.

As for the different dates, it probably still be a good idea to loop through the dates, otherwise the intermediate table would get too big.

Also there might be calls started and ended on different dates. This could be tricky to deal with, but since they gonna be the minority anyway, maybe just need to add a pre-process step to pick them out and split them up:

3/15/2016 23:49 - 3/16/2016 00:05 =>
3/15/2016 23:49 - 3/15/2016 23:60 // instead of 24:00 so we don't need to count hour 24
3/16/2016 00:00 - 3/16/2016 00:05
Community
  • 1
  • 1