0

We collect data about three different servers, about every minute, and store it in SQL Server Express 2019 table named 'DirectAccessStat'.

We need to report on the total number of connections (across the three servers) in 15 minutes intervals.

I have a SQL query that works, but is very slow, and need tips on either how we can improve the query, or suggest a different approach.

Data format:

Id     Server    Time                    Connections  CPU  Memory  Health
------ --------- ----------------------- ------------ ---- ------- -------
20250  Server3   2020-03-23 16:43:31.000 495          14   43      OK
20249  Server2   2020-03-23 16:43:29.000 550          20   75      OK
20248  Server1   2020-03-23 16:43:27.000 509          10   76      OK
20247  Server3   2020-03-23 16:42:24.000 494          8    43      OK
20246  Server2   2020-03-23 16:42:22.000 569          14   75      OK
20245  Server1   2020-03-23 16:42:20.000 510          20   76      OK
20244  Server3   2020-03-23 16:41:18.000 492          7    43      OK
20243  Server2   2020-03-23 16:41:16.000 565          11   75      OK
20242  Server1   2020-03-23 16:41:13.000 517          6    76      OK
20241  Server3   2020-03-23 16:40:11.000 495          14   43      OK
20240  Server2   2020-03-23 16:40:09.000 566          8    75      OK
20239  Server1   2020-03-23 16:40:06.000 519          16   76      OK
20238  Server3   2020-03-23 16:39:04.000 505          16   43      OK
20237  Server2   2020-03-23 16:39:02.000 574          12   75      OK
20236  Server1   2020-03-23 16:38:59.000 522          7    76      OK
20235  Server3   2020-03-23 16:37:57.000 515          10   43      OK
20234  Server2   2020-03-23 16:37:55.000 590          15   75      OK
20233  Server1   2020-03-23 16:37:52.000 523          6    76      OK
20232  Server3   2020-03-23 16:36:50.000 522          9    43      OK
20231  Server2   2020-03-23 16:36:47.000 583          15   75      OK
20230  Server1   2020-03-23 16:36:45.000 538          11   76      OK
20229  Server3   2020-03-23 16:35:43.000 520          10   43      OK
20228  Server2   2020-03-23 16:35:41.000 585          11   75      OK
20227  Server1   2020-03-23 16:35:38.000 539          8    76      OK
20226  Server3   2020-03-23 16:34:36.000 531          8    43      OK
20225  Server2   2020-03-23 16:34:34.000 602          17   75      OK
20224  Server1   2020-03-23 16:34:31.000 547          9    76      OK
20223  Server3   2020-03-23 16:33:28.000 533          9    43      OK
20222  Server2   2020-03-23 16:33:26.000 591          15   75      OK
20221  Server1   2020-03-23 16:33:24.000 545          11   76      OK

Desired output (TotalConnections is the sum of the first row of all the servers after the given 15 mintues interval):

Time             TotalConnections
---------------- ----------------
23.03 16:30      1703
23.03 16:15      1868
23.03 16:00      2046
23.03 15:45      2189
23.03 15:30      2336
23.03 15:15      2454
23.03 15:00      2531
23.03 14:45      2555
23.03 14:30      2621
23.03 14:15      2643
...

SQL that works, but is very slow (and is run as a new sub-query for every iteration, so output is clunky..):

-- Find last whole quarter
DECLARE @now DateTime = GetDate();
DECLARE @minute int = 0;
DECLARE @nowminute int = DATEPART(MINUTE, @now);

IF @nowminute > 45
    SET @minute = 45;
ELSE
    IF @nowminute > 30
        SET @minute = 30;
    ELSE
        IF @nowminute > 15
            SET @minute = 15;

-- Set start time to last whole quarter
DECLARE @lookTime DateTime = DATETIME2FROMPARTS(YEAR(@now), MONTH(@now), DAY(@now), DATEPART(HOUR, @now), @minute, 0, 0, 0);

-- Continue lookup until specified end datetime
DECLARE @endTime DateTime = DATEADD(WEEK, -1, @lookTime);

-- Output time parts
DECLARE @datepart varchar(10);
DECLARE @hourpart varchar(2);
DECLARE @minutepart varchar(2);

WHILE @lookTime > @endTime
BEGIN

    -- Set output time parts
    SET @datepart = CONVERT(varchar(5), @lookTime, 104);
    SET @hourpart = RIGHT('0' + CAST(DATEPART(HOUR, @lookTime) as varchar(2)), 2);
    SET @minutepart = RIGHT('0' + CAST(DATEPART(MINUTE, @lookTime) as varchar(2)), 2);

    SELECT (@datepart+' '+@hourpart+':'+@minutepart) AS [Time], SUM(D.Connections) AS TotalConnections
    FROM DirectAccessStat D
    INNER JOIN (
        -- Select minimum Id grouped by server, after selected looktime
        SELECT MIN(Id) AS Id
        FROM DirectAccessStat
        WHERE [Time] > @lookTime and [Time] < DATEADD(MINUTE, 20, @lookTime)
        GROUP BY [Server]
    ) AS I
    ON D.Id = I.Id;

    -- Subtract 15 minutes from looktime
    SET @lookTime = DATEADD(MINUTE, -15, @lookTime);
END
sticky bit
  • 36,626
  • 12
  • 31
  • 42
Norrz
  • 1
  • 1
  • 3
    I'd start by getting rid of that `WHILE`. That's going to very easily be your problem. You're writing SQL, not C#. RDBMS excel at set based solutions and are awful at iterative ones. – Thom A Mar 23 '20 at 15:48
  • SQL is a set based language. You don't loop your record row by row. – Eric Mar 23 '20 at 16:15

2 Answers2

0

There's sort of a presudo SQL, because I'm not familiar with SQL Server dialect. I'll use some commands that SQL Server offers, but I can't fix the query:

SELECT intervalBeginning as "Time", COUNT(*) as "TotalConnections" 
FROM (
  SELECT DATEADD(minute, ROUND(DATEDIFF(minute, 0, Time) / 15.0, 0) * 15, 0) AS intervalBeginning
  FROM DirectAccessStat
) intervals
GROUP BY intervalBeginning 
ORDER BY intervalBeginning 

Is it faster? I don't know, but writing pure SQL is sometimes by far faster, since you don't mess up with database's internals, and instead you just declaretively tell what you need, leaving the optimizations to the query optimizer.

What the query does, is it selects in the inner query only the time from your given data, but truncates this time to the nearest 15 minutes interval. Assuming that SQL Server supports selecting duplicates, we get many interval beginnings and some of them are the same. So in the outer query we group by these interval beginnings, which allowes us to output a single interval beginning and to calculate a total number of such beginnings met. Ordering intervals from past to future is just for a pretty output

Alexey S. Larionov
  • 6,555
  • 1
  • 18
  • 37
0

Thanks for the suggestions. I stepped away from the while loop and did it with pure SQL. Got inspiration from this thread.

SELECT D2.Time, SUM(D2.AvgConnections) AS TotalConnections FROM (
SELECT
    MIN(D.Id) AS Id,
    [Server] AS Server,
    (
        RIGHT('0'+CAST(DATEPART(MONTH, D.[Time]) as varchar(2)), 2)
        + '.' + RIGHT('0'+CAST(DATEPART(DAY, D.[Time]) as varchar(2)), 2)
        + ' '+ RIGHT('0'+CAST(DATEPART(HOUR, D.[Time]) as varchar(2)), 2)
        + ':'+ RIGHT('0'+CAST(((DATEPART(MINUTE, D.[Time]) / 15) * 15) as varchar(2)), 2)
    ) AS [Time],
    AVG(Connections) AS AvgConnections
FROM DirectAccessStat D
GROUP BY
    [Server],
    DATEPART(YEAR, D.[Time]),
    DATEPART(MONTH, D.[Time]),
    DATEPART(DAY, D.[Time]),
    DATEPART(HOUR, D.[Time]),
    (DATEPART(MINUTE, D.[Time]) / 15)
) D2
GROUP BY D2.Time
ORDER BY D2.Time DESC

Outputs:

Time        TotalConnections
----------- ----------------
03.23 18:45 1089
03.23 18:30 1120
03.23 18:15 1154
03.23 18:00 1182
03.23 17:45 1207
03.23 17:30 1244
03.23 17:15 1321
03.23 17:00 1413
03.23 16:45 1511
03.23 16:30 1626
03.23 16:15 1785
03.23 16:00 1949
Norrz
  • 1
  • 1