
I have a performance question about common table expressions in SQL Server. On our development team we chain a lot of CTEs when building queries. I am currently working on a query that had terrible performance, but I found that if I broke the chain in the middle, inserted all the records up to that point into a temporary table, and then continued by selecting from that temp table, performance improved significantly. Now I would like help understanding whether this kind of change only applies to this specific query, and why the two cases you will see below differ so much in performance. Or are we possibly overusing CTEs on our team, and can we gain performance generally by learning from this case?

Please try to explain to me exactly what is happening here...

The code is complete and you will be able to run it on SQL Server 2008 (it uses the TIME data type, inline variable initialization, and multi-row VALUES, so it will not run on 2005). One part is commented out, and the idea is that you can switch between the two cases by commenting one or the other out. I have marked where to put the block comments with --block comment here and --end block comment here.

The slow-performing case is the one left uncommented by default. Here it is:

--Declare tables to use in example.
CREATE TABLE #Preparation 
(
    Date DATETIME NOT NULL
    ,Hour INT NOT NULL
    ,Sales NUMERIC(9,2)
    ,Items INT
);

CREATE TABLE #Calendar
(
    Date DATETIME NOT NULL
);

CREATE TABLE #OpenHours
(
    Day INT NOT NULL,
    OpenFrom TIME NOT NULL,
    OpenTo TIME NOT NULL
);

--Fill tables with sample data.
INSERT INTO #OpenHours (Day, OpenFrom, OpenTo)
VALUES
    (1, '10:00', '20:00'),
    (2, '10:00', '20:00'),
    (3, '10:00', '20:00'),
    (4, '10:00', '20:00'),
    (5, '10:00', '20:00'),
    (6, '10:00', '20:00'),
    (7, '10:00', '20:00')

DECLARE @CounterDay INT = 0, @CounterHour INT = 0, @Sales NUMERIC(9, 2), @Items INT;

WHILE @CounterDay < 365
BEGIN
    SET @CounterHour = 0;
    WHILE @CounterHour < 5
    BEGIN
        SET @Items = CAST(RAND() * 100 AS INT);
        SET @Sales = CAST(RAND() * 1000 AS NUMERIC(9, 2));
        IF @Items % 2 = 0
        BEGIN
            SET @Items = NULL;
            SET @Sales = NULL;
        END

        INSERT INTO #Preparation (Date, Hour, Items, Sales)
        VALUES (DATEADD(DAY, @CounterDay, '2011-01-01'), @CounterHour + 13, @Items, @Sales);

        SET @CounterHour += 1;
    END
    INSERT INTO #Calendar (Date) VALUES (DATEADD(DAY, @CounterDay, '2011-01-01'));
    SET @CounterDay += 1;
END

--Here the query starts.
;WITH P AS (
    SELECT DATEADD(HOUR, Hour, Date) AS Hour
        ,Sales
        ,Items
    FROM #Preparation
),
O AS (
        SELECT DISTINCT DATEADD(HOUR, SV.number, C.Date) AS Hour
        FROM #OpenHours AS O
            JOIN #Calendar AS C ON O.Day = DATEPART(WEEKDAY, C.Date)
            JOIN master.dbo.spt_values AS SV ON SV.number BETWEEN DATEPART(HOUR, O.OpenFrom) AND DATEPART(HOUR, O.OpenTo)
),
S AS (
    SELECT O.Hour, P.Sales, P.Items
    FROM O
        LEFT JOIN P ON P.Hour = O.Hour
)

--block comment here case 1 (slow performing)
--With this technique it takes about 34 seconds.
,N AS (
        SELECT  
            A.Hour
            ,A.Sales AS SalesOrg
            ,CASE WHEN COALESCE(B.Sales, C.Sales, 1) < 0
                THEN 0 ELSE COALESCE(B.Sales, C.Sales, 1) END AS Sales
            ,A.Items AS ItemsOrg
            ,COALESCE(B.Items, C.Items, 1) AS Items
        FROM S AS A
        OUTER APPLY (SELECT TOP 1 *
                     FROM S
                     WHERE Hour <= A.Hour
                        AND Sales IS NOT NULL
                        AND DATEDIFF(DAY, Hour, A.Hour) = 0                      
                     ORDER BY Hour DESC) B
        OUTER APPLY (SELECT TOP 1 *
                     FROM S
                     WHERE Sales IS NOT NULL
                        AND DATEDIFF(DAY, Hour, A.Hour) = 0
                     ORDER BY Hour) C
    )
--end block comment here case 1 (slow performing)

/*--block comment here case 2 (fast performing)
--With this technique it takes about 2 seconds.
SELECT * INTO #tmpS FROM S;

WITH
N AS (
        SELECT  
            A.Hour
            ,A.Sales AS SalesOrg
            ,CASE WHEN COALESCE(B.Sales, C.Sales, 1) < 0
                THEN 0 ELSE COALESCE(B.Sales, C.Sales, 1) END AS Sales
            ,A.Items AS ItemsOrg
            ,COALESCE(B.Items, C.Items, 1) AS Items
        FROM #tmpS AS A
        OUTER APPLY (SELECT TOP 1 *
                     FROM #tmpS
                     WHERE Hour <= A.Hour
                        AND Sales IS NOT NULL
                        AND DATEDIFF(DAY, Hour, A.Hour) = 0                      
                     ORDER BY Hour DESC) B
        OUTER APPLY (SELECT TOP 1 *
                     FROM #tmpS
                     WHERE Sales IS NOT NULL
                        AND DATEDIFF(DAY, Hour, A.Hour) = 0
                     ORDER BY Hour) C
    )
--end block comment here case 2 (fast performing)*/
SELECT * FROM N ORDER BY Hour


IF OBJECT_ID('tempdb..#tmpS') IS NOT NULL DROP TABLE #tmpS;

DROP TABLE #Preparation;
DROP TABLE #Calendar;
DROP TABLE #OpenHours;

If you would like to try to understand what I am doing in the last step, I have an SO question about it here.

For me case 1 takes about 34 seconds and case 2 takes about 2 seconds. The difference is that in case 2 I store the result of S in a temp table, while in case 1 I use S directly in the next CTE.
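
One way to dig into why the two cases differ, beyond wall-clock time, is to capture I/O statistics and the actual execution plans before running each case. A minimal sketch (not part of the original script):

SET STATISTICS IO ON;   --reports scan count and logical reads per table
SET STATISTICS TIME ON; --reports CPU and elapsed time
--In SSMS, also enable "Include Actual Execution Plan" (Ctrl+M) before running,
--then compare how many times #Preparation and the worktables are scanned in each case.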

  • +1 For runnable code. I suggest running them both and pasting the execution plan XML into SQL Sentry Plan Explorer. The reason for the difference will then be quite evident. It ends up scanning `#Preparation` 20,000 times in one part of the plan and 10,000 times in another part, for example. – Martin Smith May 25 '12 at 10:02
  • Thanks. I installed SQL Sentry Plan Explorer. I am still learning SQL Server and cannot read execution plans yet, not even in Plan Explorer. But is the answer that one cannot say whether a CTE or a temp table is best in a given scenario without investigating the execution plans? Where do you see 20,000 and 10,000 times? I can't find it. Is it possible to say something about why one part is scanning #Preparation 20,000 times by referring to the code? – John May 25 '12 at 10:40
  • Have you seen this? "SQL 2005 CTE vs TEMP table Performance when used in joins of other tables" http://stackoverflow.com/questions/1531835/sql-2005-cte-vs-temp-table-performance-when-used-in-joins-of-other-tables – JV. May 25 '12 at 13:33

3 Answers


A CTE is essentially just a disposable view. It will pretty much never make a query any faster than just putting the CTE code into a FROM clause as a table expression.
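
In other words, a non-recursive CTE is just an inline name for a derived table, and the optimizer expands it the same way. An illustrative sketch (reusing the question's #Preparation table, not code from the answer) showing the two equivalent forms:

--CTE form and derived-table form produce the same plan; neither is materialized.
WITH P AS (SELECT DATEADD(HOUR, Hour, Date) AS Hour, Sales FROM #Preparation)
SELECT * FROM P WHERE Sales IS NOT NULL;

SELECT *
FROM (SELECT DATEADD(HOUR, Hour, Date) AS Hour, Sales FROM #Preparation) AS P
WHERE Sales IS NOT NULL;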

In your example, I believe the real issue is the date functions.

Your first (slow) case requires the date functions to be run for every row.

For your second (faster) case they are run once and stored into a table.

This is not normally so noticeable unless you apply some sort of logic to the function-derived field. In your case you are doing an ORDER BY on Hour, which is very costly. In your second example it is a simple sort on a stored column, but in the first you are running that function for each row and THEN sorting.
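
Following that reasoning, the fast case can usually be pushed a bit further by indexing the materialized column that the OUTER APPLYs filter and sort on. A hedged sketch based on the question's case 2 (the index is my addition, not something from the original script); it goes at the same point as case 2's SELECT ... INTO:

--Materialize S once so the DATEADD result is stored, then index the Hour column
--used by the OUTER APPLY filters and the final ORDER BY.
SELECT * INTO #tmpS FROM S;
CREATE CLUSTERED INDEX IX_tmpS_Hour ON #tmpS (Hour);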

For a lot more in-depth reading on CTEs, see this question on DBA.SE.

– JNK

A CTE is just a syntax shortcut. That CTE is run (and re-run) in the join. With the #temp it is evaluated once and the results are then re-used in the join.

The documentation is misleading.

MSDN_CTE

A common table expression (CTE) can be thought of as a temporary result set.

This article explains it better

PapaCTEarticle

A CTE is a nice fit for this type of scenario since it makes the T-SQL much more readable (like a view), yet it can be used more than once in a query that immediately follows in the same batch. Of course, it is not available beyond that scope. Additionally, the CTE is a language-level construct—meaning that SQL Server does not internally create temp or virtual tables. The CTE's underlying query will be called each time it is referenced in the immediately following query.
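
So a CTE that is referenced more than once is evaluated more than once; only a temp table (or table variable) actually stores the result. An illustrative sketch against the question's #Preparation table (not from the article):

--Both references to Daily re-run the aggregation over #Preparation;
--with a #temp holding the daily totals it would be computed only once.
WITH Daily AS (
    SELECT Date, SUM(Sales) AS DaySales
    FROM #Preparation
    GROUP BY Date
)
SELECT a.Date, a.DaySales, b.DaySales AS PrevDaySales
FROM Daily AS a
LEFT JOIN Daily AS b ON b.Date = DATEADD(DAY, -1, a.Date);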

Take a look at table-valued parameters (TVPs).

TVP

They have a structure like a #temp but without as much overhead. They are read-only, but it appears you only need read-only. Creating and dropping a #temp will vary, but on a low- to mid-range server it is about a 0.1 second hit, and with a TVP there is essentially no hit.
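
For reference, a table-valued parameter is built on a user-defined table type, roughly like this. A minimal sketch (the type and procedure names are made up for illustration):

--The table type defines the structure once...
CREATE TYPE dbo.HourlySales AS TABLE
(
    Hour DATETIME NOT NULL,
    Sales NUMERIC(9, 2) NULL,
    Items INT NULL
);
GO
--...and a procedure receives it as a READONLY table-valued parameter.
CREATE PROCEDURE dbo.ProcessHourlySales
    @Rows dbo.HourlySales READONLY
AS
    SELECT Hour, Sales, Items FROM @Rows ORDER BY Hour;
GO
--Callers fill a variable of the type and pass it in.
DECLARE @Rows dbo.HourlySales;
INSERT INTO @Rows (Hour, Sales, Items) VALUES ('2011-01-01T13:00:00', 100.00, 5);
EXEC dbo.ProcessHourlySales @Rows = @Rows;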

– paparazzo
  • TVPs are not a good recommendation for this. There are potentially a lot of issues with them, not least of which is that they exist outside transactional scope and therefore can't be rolled back! – JNK May 25 '12 at 14:42
  • @JNK Help me out. A TVP is read only, so why do I care if it can't be rolled back? This sample is select only, no insert, update, or delete. Where is the risk? You have helped me more than once and have always been correct. – paparazzo May 25 '12 at 14:49
  • I have a general distrust of TVPs for anything other than output or testing. They still write to the log file and have disk IO overhead, but aren't actually atomic and can't be indexed, don't have statistics, and exec plans always assume a single row for them. – JNK May 25 '12 at 14:55
  • [Here is an excellent reference comparing temp tables and TVPs.](http://dba.stackexchange.com/questions/16385/whats-the-difference-between-a-temp-table-and-table-variable-in-sql-server) – JNK May 25 '12 at 14:57

CTEs are very nice syntactic sugar and make queries much more readable. However, on large datasets their performance has been catastrophic in my experience, and I have had to replace them all with temp tables, adding specific indexes as needed.

For example:

SELECT IdBL, LgnBL, chemin, IdBE, IdLot, SUM(CuTrait) AS CuTraitBE
INTO #temp_arbo_of_8_cte
FROM #CoutTraitParBE
GROUP BY IdBL, LgnBL, chemin, IdBE, IdLot;

CREATE NONCLUSTERED INDEX #temp_arbo_of_8_cte_index_1 ON #temp_arbo_of_8_cte(chemin, IdBE, IdLot);

SELECT a.*, CuTraitBE, ROUND(CuTraitBE * QteSortieBE, 3) AS CoutTraitParBE, QteFactParBE*PxVte AS CaParBE
INTO #temp_arbo_of_8
FROM #temp_arbo_of_7 a
LEFT JOIN #temp_arbo_of_8_cte b ON a.chemin=b.chemin AND a.IdBE=b.IdBE AND a.IdLot=b.IdLot;

/*
WITH cte AS (
    SELECT IdBL, LgnBL, chemin, IdBE, IdLot, SUM(CuTrait) AS CuTraitBE 
    FROM #CoutTraitParBE
    GROUP BY IdBL, LgnBL, chemin, IdBE, IdLot
)
SELECT a.*, CuTraitBE, ROUND(CuTraitBE * QteSortieBE, 3) AS CoutTraitParBE, QteFactParBE*PxVte AS CaParBE
INTO #temp_arbo_of_8
FROM #temp_arbo_of_7 a
LEFT JOIN cte b ON a.chemin=b.chemin AND a.IdBE=b.IdBE AND a.IdLot=b.IdLot;
*/

With the CTE version, the query optimizer would get lost and generate an awfully complex execution plan. The query would run forever. Without the CTE it runs in a split second.

So yes, CTEs can be a massive performance problem!

– Ludovic Aubert