0

I have this taxing query that takes 5 minutes, and I'd like to have its results stored in a table, which will be queried by a backend.

Also, the table should be updated every hour and completely replace the table's content with the new data.

I found this Server Agent solution, and from my understanding I should to this:

  1. Create the results table.
  2. Create a stored procedure that drops the results table data, runs the 5 min query, and inserts the new data.
  3. Create the Server Agent job that runs the procedure in intervals.

Is this the optimal way?

Here is the query in question. It goes over a ton of rows, which I think is the main speed impactor.

WITH 
salesCTE AS (
    select
        concat(year(m.addate), '-', format(m.addate, 'MM')) AS ym,
        year(m.addate) as y,
        format(m.addate, 'MM') as m,
        sum(M.anvalue) as salesRev
    FROM tHE_Move m
    WHERE ( 
        RIGHT(LEFT(M.acKey,5),3) = '300'
        OR RIGHT(LEFT(M.acKey,5),3) = '305'
        OR RIGHT(LEFT(M.acKey,5),3) = '319'
        OR RIGHT(LEFT(M.acKey,5),3) = '380'
        OR RIGHT(LEFT(M.acKey,5),3) = '355'
        OR RIGHT(LEFT(M.acKey,5),3) = '360'
        OR RIGHT(LEFT(M.acKey,5),3) = '3X1'
        OR RIGHT(LEFT(M.acKey,5),3) = '395'
    ) and m.adDate between '01.01.2014' and '01.01.2030'
    GROUP BY 
        concat(year(m.addate), '-', format(m.addate, 'MM')),
        year(m.addate),
        format(m.addate, 'MM')
),
retailCTE AS (
    select
        concat(year(m.addate), '-', format(m.addate, 'MM')) AS ym,
        year(m.addate) as y,
        format(m.addate, 'MM') as m,
        sum(M.anvalue) as retailRev
    FROM tHE_Move m
    where (
        RIGHT(LEFT(M.acKey,5),3) = '321'
        OR RIGHT(LEFT(M.acKey,5),3) = '322'
        OR RIGHT(LEFT(M.acKey,5),3) = '323'
        OR RIGHT(LEFT(M.acKey,5),3) = '324'
        OR RIGHT(LEFT(M.acKey,5),3) = '325'
        OR RIGHT(LEFT(M.acKey,5),3) = '326'
        OR RIGHT(LEFT(M.acKey,5),3) = '327'
        OR RIGHT(LEFT(M.acKey,5),3) = '328'
        OR RIGHT(LEFT(M.acKey,5),3) = '329'
        OR RIGHT(LEFT(M.acKey,5),3) = '331'
        OR RIGHT(LEFT(M.acKey,5),3) = '332'
        OR RIGHT(LEFT(M.acKey,5),3) = '333'
        OR RIGHT(LEFT(M.acKey,5),3) = '334'
        OR RIGHT(LEFT(M.acKey,5),3) = '335'
        OR RIGHT(LEFT(M.acKey,5),3) = '336'
        OR RIGHT(LEFT(M.acKey,5),3) = '337'
        OR RIGHT(LEFT(M.acKey,5),3) = '338'
        OR RIGHT(LEFT(M.acKey,5),3) = '339'
        OR RIGHT(LEFT(M.acKey,5),3) = '341'
        OR RIGHT(LEFT(M.acKey,5),3) = '342'
        OR RIGHT(LEFT(M.acKey,5),3) = '343'
        OR RIGHT(LEFT(M.acKey,5),3) = '344'
        OR RIGHT(LEFT(M.acKey,5),3) = '345'
        OR RIGHT(LEFT(M.acKey,5),3) = '346'
        OR RIGHT(LEFT(M.acKey,5),3) = '347'
        OR RIGHT(LEFT(M.acKey,5),3) = '348'
        OR RIGHT(LEFT(M.acKey,5),3) = '349'
        OR RIGHT(LEFT(M.acKey,5),3) = '352'
        OR RIGHT(LEFT(M.acKey,5),3) = '353'
        ) and m.adDate between '01.01.2014' and '01.01.2030'
    GROUP BY 
        concat(year(m.addate), '-', format(m.addate, 'MM')),
        year(m.addate),
        format(m.addate, 'MM')
)
SELECT 
    s1.ym,
    s1.salesRev,
    (s1.salesRev / s2.salesRev - 1) * 100 salesDelta,
    r1.retailRev,
    (r1.retailRev / r2.retailRev - 1) * 100 retailDelta,
    s1.salesRev + r1.retailRev totalRev,
    ((s1.salesRev + r1.retailRev) / (s2.salesRev + r2.retailRev) - 1) * 100 totalDelta
FROM salesCTE s1
    left join salesCTE s2
        on s2.y = s1.y - 1 and s1.m = s2.m
    left join retailCTE r1
        on s1.ym = r1.ym
    left join retailCTE r2
        on r2.y = r1.y - 1 and r1.m = r2.m
order by s1.ym desc
Ivan
  • 1,967
  • 4
  • 34
  • 60
  • 1
    That works, but is your actual question on how to improve the performance of your query? If so, we need said query to help you improve it. – Thom A Mar 19 '19 at 12:22
  • 3
    Doesn't sound ideal unless you can be unavailable 8% of the time. You should have the new query results ready to go before dropping the old one – Martin Smith Mar 19 '19 at 12:24
  • 1
    Either with the Server Agent solution, but usually I do it through a Scheduler Tool like BMC Control-M or similiar – Esteban P. Mar 19 '19 at 12:25
  • @Larnu That should be a separate question I guess. I added the query. – Ivan Mar 19 '19 at 12:29
  • @MartinSmith I added the query in question. So are you saying that I should simply reverse the actions i.e. have the query run, store the results in a temp table, and **then** drop the current data, and finally have it replaced? – Ivan Mar 19 '19 at 12:30
  • 1
    `RIGHT(LEFT(M.acKey,5),3)` is going to be costly in your `WHERE`. Have you considered having that information stored as a persisted computed column; that's probably indexed. – Thom A Mar 19 '19 at 12:31
  • 1
    Also, I recommend against `FORMAT`. That function is a huge performance killer. I've had queries that ran in minutes run in seconds (or less) by getting rid of `FORMAT` and using `CONVERT`, `DATEPART`, `DATENAME` etc. I would personally replace `format(m.addate, 'MM')` with `RIGHT('00' + CONVERT(varchar(2),DATEPART(MONTH,m.addate)),2)`. It does look more complex, but it's far more performant. – Thom A Mar 19 '19 at 12:31
  • 1
    A staging table - not a temp table. That can then replace the actual table. Either via a drop and rename or `ALTER TABLE ... SWITCH` – Martin Smith Mar 19 '19 at 12:34
  • 1
    `FORMAT` vs `CONVERT` [DB<>Fiddle](https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=df9a706db82a6a1f92dfdb4b663527da) – Thom A Mar 19 '19 at 12:41
  • @Larnu Your suggestions dropped the time from 5 min to 30 sec. I didn't realize that the date and right/left parts impacted the query so much. I guess this changes things up, as I can now completely abandon the server agent approach and just use the query as the speed is good enough. Can you please write your suggestions as the answer so I can accept it? It didn't answer the question, but it solved my problem. Thank you. – Ivan Mar 19 '19 at 12:48

1 Answers1

2

As mentioned in the comment FORMAT is a massive performance hitter; and my massive i really do mean massive. Take this DB<>Fiddle which uses FORMAT and CONVERT to change the value of a date. The query using CONVERT executes if 46ms (on db fiddle), yet the FORMAT query took 1218 ms! That's 26 times slower.

Changing format(m.addate, 'MM') to RIGHT('00' + CONVERT(varchar(2),DATEPART(MONTH,DATEADD(DAY,N-1,0))),2) will have significant performance benefits on your query. Although the latter looks more complex, it will (as the fiddle shows) out perform FORMAT by a significant margin. Honestly, I recommend never using FORMAT, Microsoft got things really wrong with that function.

I do, also, however, suggest adding RIGHT(LEFT(M.acKey,5),3) as a persisted column to your table:

ALTER TABLE tHE_Move ADD {Meaningful Name} AS RIGHT(LEFT(M.acKey,5),3) PERSISTED;

Then you can also add that value to an index (new or existing) and it'll also greatly benefit the performance of your query; maybe pushing it to only a few seconds.

Thom A
  • 88,727
  • 11
  • 45
  • 75
  • Thank you for the answer. Although it wasn't about the scheduling, it eliminated the need for it. The 5 min query now takes 30 sec, which will be even less after some optimization. – Ivan Mar 19 '19 at 12:56