174

Which are more performant, CTE or Temporary Tables?

DineshDB
  • 5,998
  • 7
  • 33
  • 49
Blankman
  • 259,732
  • 324
  • 769
  • 1,199
  • 4
    Related question: http://dba.stackexchange.com/q/13112 – Rachel Jan 22 '15 at 14:52
  • Users can find background information (not relating to performance) in the [Using Common Table Expressions](https://technet.microsoft.com/en-us/library/ms190766(v=sql.105).aspx) page on technet.microsoft.com. – Sheridan Oct 08 '15 at 10:26
  • Anyone finding this beware, this is SQL SERVER specific, I found it googling redshift... which is based on postgres. Postgres treats CTEs as temp tables, while MSSQL treats them as views, and god only knows what redshift is doing under the hood... – user433342 Jun 16 '23 at 13:14

12 Answers12

259

It depends.

First of all

What is a Common Table Expression?

A (non recursive) CTE is treated very similarly to other constructs that can also be used as inline table expressions in SQL Server. Derived tables, Views, and inline table valued functions. Note that whilst BOL says that a CTE "can be thought of as temporary result set" this is a purely logical description. More often than not it is not materlialized in its own right.

What is a temporary table?

This is a collection of rows stored on data pages in tempdb. The data pages may reside partially or entirely in memory. Additionally the temporary table may be indexed and have column statistics.

Test Data

CREATE TABLE T(A INT IDENTITY PRIMARY KEY, B INT , F CHAR(8000) NULL);

INSERT INTO T(B)
SELECT TOP (1000000)  0 + CAST(NEWID() AS BINARY(4))
FROM master..spt_values v1,
     master..spt_values v2;

Example 1

WITH CTE1 AS
(
SELECT A,
       ABS(B) AS Abs_B,
       F
FROM T
)
SELECT *
FROM CTE1
WHERE A = 780

Plan 1

Notice in the plan above there is no mention of CTE1. It just accesses the base tables directly and is treated the same as

SELECT A,
       ABS(B) AS Abs_B,
       F
FROM   T
WHERE  A = 780 

Rewriting by materializing the CTE into an intermediate temporary table here would be massively counter productive.

Materializing the CTE definition of

SELECT A,
       ABS(B) AS Abs_B,
       F
FROM T

Would involve copying about 8GB of data into a temporary table then there is still the overhead of selecting from it too.

Example 2

WITH CTE2
     AS (SELECT *,
                ROW_NUMBER() OVER (ORDER BY A) AS RN
         FROM   T
         WHERE  B % 100000 = 0)
SELECT *
FROM   CTE2 T1
       CROSS APPLY (SELECT TOP (1) *
                    FROM   CTE2 T2
                    WHERE  T2.A > T1.A
                    ORDER  BY T2.A) CA 

The above example takes about 4 minutes on my machine.

Only 15 rows of the 1,000,000 randomly generated values match the predicate but the expensive table scan happens 16 times to locate these.

enter image description here

This would be a good candidate for materializing the intermediate result. The equivalent temp table rewrite took 25 seconds.

INSERT INTO #T
SELECT *,
       ROW_NUMBER() OVER (ORDER BY A) AS RN
FROM   T
WHERE  B % 100000 = 0

SELECT *
FROM   #T T1
       CROSS APPLY (SELECT TOP (1) *
                    FROM   #T T2
                    WHERE  T2.A > T1.A
                    ORDER  BY T2.A) CA 

With Plan

Intermediate materialisation of part of a query into a temporary table can sometimes be useful even if it is only evaluated once - when it allows the rest of the query to be recompiled taking advantage of statistics on the materialized result. An example of this approach is in the SQL Cat article When To Break Down Complex Queries.

In some circumstances SQL Server will use a spool to cache an intermediate result, e.g. of a CTE, and avoid having to re-evaluate that sub tree. This is discussed in the (migrated) Connect item Provide a hint to force intermediate materialization of CTEs or derived tables. However no statistics are created on this and even if the number of spooled rows was to be hugely different from estimated is not possible for the in progress execution plan to dynamically adapt in response (at least in current versions. Adaptive Query Plans may become possible in the future).

TheGameiswar
  • 27,855
  • 8
  • 56
  • 94
Martin Smith
  • 438,706
  • 87
  • 741
  • 845
  • 46
    This is the only answer that answers the actual question (which is asking which has better performance not what is the difference or which is your favorite), and it answers that question correctly: "It depends" is the right answer. It is also the only answer with supporting data to explain, several others (with high numbers of votes) make definite claims that one is better than the other with no references or proof... To be clear, all of those answers are also **wrong**. Because "It depends" – Arkaine55 Jul 06 '15 at 19:11
  • 4
    It is also a well written, well referenced answer. Seriously top notch. – Dan Williams Aug 15 '19 at 15:59
  • 2
    I love to highlight this part which I found to be true Intermediate materialisation of part of a query into a temporary table can sometimes be useful even if it is only evaluated once – Mark Monforti Sep 02 '20 at 15:34
74

I'd say they are different concepts but not too different to say "chalk and cheese".

  • A temp table is good for re-use or to perform multiple processing passes on a set of data.

  • A CTE can be used either to recurse or to simply improved readability.
    And, like a view or inline table valued function can also be treated like a macro to be expanded in the main query

  • A temp table is another table with some rules around scope

I have stored procs where I use both (and table variables too)

gbn
  • 422,506
  • 82
  • 585
  • 676
  • 16
    Temp tables also allows for Indexes and even Statistics which are sometimes necessary, while a CTE does not. – CodeCowboyOrg Sep 04 '14 at 15:30
  • 13
    I think this answer doesn't highlight enough the fact that CTE's can lead to terrible performance. I usually refer to this [answer](http://dba.stackexchange.com/a/13117/65699) on dba.stackexchange. Your question comes up second in my search engine if I'm looking up `cte vs temporary tables` so IMHO this answer needs to highlight the drawbacks of CTE's better. TL;DR of the linked answer: *a CTE should never be used for performance.*. I agree with that quote as I've experienced the downsides of CTE's. – TT. Feb 03 '16 at 07:53
  • 3
    @TT. Interesting. I find that CTEs perform much better – J.S. Orris Jul 21 '17 at 20:33
  • 1
    This is a pretty old q/a, but came up near the top of my search so just wanted to add a point in favor of CTEs that hasn't been mentioned here: they can be used in views, temp tables and table variables can't. – 8forty Apr 24 '22 at 20:20
61

CTE has its uses - when data in the CTE is small and there is strong readability improvement as with the case in recursive tables. However, its performance is certainly no better than table variables and when one is dealing with very large tables, temporary tables significantly outperform CTE. This is because you cannot define indices on a CTE and when you have large amount of data that requires joining with another table (CTE is simply like a macro). If you are joining multiple tables with millions of rows of records in each, CTE will perform significantly worse than temporary tables.

CSW
  • 619
  • 5
  • 2
  • 11
    I've seen this from my own experience. CTE's perform significantly slower. – goku_da_master Jan 25 '11 at 17:36
  • 9
    CTE's also perform slower because the results are not cached. So everytime you use the CTE it re-runs the query, plan and all. – goku_da_master Jan 25 '11 at 19:23
  • 2
    And the db engine may choose to re-run the query not only every reference, but for every _row_ of the consumer query, as a correlated subquery... you must always watch out for that if it is not desired. – Mike M Jul 11 '14 at 12:36
  • 1
    The temp table is stored in tempdb on SQL Server , which is disk but has the benefit of being indexed and SQL optimizer works well on select queries in that case. Not sure on which db or disk area the CTE is stored (when it exceeds the memory size and is queued for IO paging)but its never optimzed with the large volume of data. I have used the compiler option (with recompile) sometimes to make it faster – rmehra76 Jun 16 '20 at 20:32
41

Temp tables are always on disk - so as long as your CTE can be held in memory, it would most likely be faster (like a table variable, too).

But then again, if the data load of your CTE (or temp table variable) gets too big, it'll be stored on disk, too, so there's no big benefit.

In general, I prefer a CTE over a temp table since it's gone after I used it. I don't need to think about dropping it explicitly or anything.

So, no clear answer in the end, but personally, I would prefer CTE over temp tables.

GEOCHET
  • 21,119
  • 15
  • 74
  • 98
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • 3
    In the case of SQLite and PostgreSQL, temporary tables *are* automatically dropped (usually at the end of a session). I don't know about other DBMS though. – Serrano Feb 11 '13 at 12:50
  • 2
    CTE is like a temp view. AFAIK data isn't stored so nothing van be held in memory or stored on disk. Important note, everytime you use the CTE the query runs again. – Rob Apr 29 '13 at 14:38
  • 2
    Personally I have never seen a CTE work better than a Temp table for speed. And well debugging is much easier with temp table – Mark Monforti Sep 14 '16 at 18:56
9

So the query I was assigned to optimize was written with two CTEs in SQL server. It was taking 28sec.

I spent two minutes converting them to temp tables and the query took 3 seconds

I added an index to the temp table on the field it was being joined on and got it down to 2 seconds

Three minutes of work and now its running 12x faster all by removing CTE. I personally will not use CTEs ever they are tougher to debug as well.

The crazy thing is the CTEs were both only used once and still putting an index on them proved to be 50% faster.

Mark Monforti
  • 463
  • 4
  • 8
8

I've used both but in massive complex procedures have always found temp tables better to work with and more methodical. CTEs have their uses but generally with small data.

For example I've created sprocs that come back with results of large calculations in 15 seconds yet convert this code to run in a CTE and have seen it run in excess of 8 minutes to achieve the same results.

Andy_RC
  • 81
  • 1
  • 3
  • Yes I love this comment. There seems to be this odd paradigm that if I can write something with one line of code instead of two I should. I am debugging something right now that has 13 CTE's nested in it and the CTEs are called data1-data13. Total insanity. – Mark Monforti Sep 02 '20 at 15:33
7

CTE won't take any physical space. It is just a result set we can use join.

Temp tables are temporary. We can create indexes, constrains as like normal tables for that we need to define all variables.

Temp table's scope only within the session. EX: Open two SQL query window

create table #temp(empid int,empname varchar)
insert into #temp 
select 101,'xxx'

select * from #temp

Run this query in first window then run the below query in second window you can find the difference.

select * from #temp
shA.t
  • 16,580
  • 5
  • 54
  • 111
selvaraj
  • 889
  • 2
  • 16
  • 29
  • 6
    >> "it is just a result set we can use join." -> This is not accurate. CTE is not a a "result set" but inline code. SQL Server query engine parse the CTE code as part of the query text and build an execution plan according. The idea that CTE is inline is the big advantage of using CTE, since it allows the server to create a "combine execution plan" – Ronen Ariely Aug 29 '15 at 09:24
5

One use where I found CTE's excelled performance wise was where I needed to join a relatively complex Query on to a few tables which had a few million rows each.

I used the CTE to first select the subset based of the indexed columns to first cut these tables down to a few thousand relevant rows each and then joined the CTE to my main query. This exponentially reduced the runtime of my query.

Whilst results for the CTE are not cached and table variables might have been a better choice I really just wanted to try them out and found the fit the above scenario.

purchas
  • 349
  • 7
  • 19
  • Also, I think since i only use the CTE in the join I only really execute the CTE once in my query so caching the results was not such a big issue in this respect – purchas Sep 06 '12 at 00:34
5

Late to the party, but...

The environment I work in is highly constrained, supporting some vendor products and providing "value-added" services like reporting. Due to policy and contract limitations, I am not usually allowed the luxury of separate table/data space and/or the ability to create permanent code [it gets a little better, depending upon the application].

IOW, I can't usually develop a stored procedure or UDFs or temp tables, etc. I pretty much have to do everything through MY application interface (Crystal Reports - add/link tables, set where clauses from w/in CR, etc.). One SMALL saving grace is that Crystal allows me to use COMMANDS (as well as SQL Expressions). Some things that aren't efficient through the regular add/link tables capability can be done by defining a SQL Command. I use CTEs through that and have gotten very good results "remotely". CTEs also help w/ report maintenance, not requiring that code be developed, handed to a DBA to compile, encrypt, transfer, install, and then require multiple-level testing. I can do CTEs through the local interface.

The down side of using CTEs w/ CR is, each report is separate. Each CTE must be maintained for each report. Where I can do SPs and UDFs, I can develop something that can be used by multiple reports, requiring only linking to the SP and passing parameters as if you were working on a regular table. CR is not really good at handling parameters into SQL Commands, so that aspect of the CR/CTE aspect can be lacking. In those cases, I usually try to define the CTE to return enough data (but not ALL data), and then use the record selection capabilities in CR to slice and dice that.

So... my vote is for CTEs (until I get my data space).

Marc
  • 778
  • 7
  • 18
4

I just tested this- both CTE and non-CTE (where the query was typed out for every union instance) both took ~31 seconds. CTE made the code much more readable though- cut it down from 241 to 130 lines which is very nice. Temp table on the other hand cut it down to 132 lines, and took FIVE SECONDS to run. No joke. all of this testing was cached- the queries were all run multiple times before.

user2989981
  • 89
  • 2
  • 2
  • 10
3

From my experience in SQL Server,I found one of the scenarios where CTE outperformed Temp table

I needed to use a DataSet(~100000) from a complex Query just ONCE in my stored Procedure.

  • Temp table was causing an overhead on SQL where my Procedure was performing slowly(as Temp Tables are real materialized tables that exist in tempdb and Persist for the life of my current procedure)

  • On the other hand, with CTE, CTE Persist only until the following query is run. So, CTE is a handy in-memory structure with limited Scope. CTEs don't use tempdb by default.

This is one scenario where CTEs can really help simplify your code and Outperform Temp Table. I had Used 2 CTEs, something like

WITH CTE1(ID, Name, Display) 
AS (SELECT ID,Name,Display from Table1 where <Some Condition>),
CTE2(ID,Name,<col3>) AS (SELECT ID, Name,<> FROM CTE1 INNER JOIN Table2 <Some Condition>)
SELECT CTE2.ID,CTE2.<col3>
FROM CTE2
GO
Il Vic
  • 5,576
  • 4
  • 26
  • 37
Amardeep Kohli
  • 503
  • 3
  • 7
  • 1
    Your answer seems to be very generic... How do you measure that "CTE outperformed Temp table"? Have you got some time measurements? In my opinion you should edit your answer and add more details. – Il Vic Sep 07 '15 at 10:19
  • Yes i have time measurements and Execution plan to support my statement. – Amardeep Kohli Sep 07 '15 at 10:50
  • Cannot add the img for execution plan because of limited privileges.Will update details once it is resolved – Amardeep Kohli Sep 07 '15 at 10:52
3

This is a really open ended question, and it all depends on how its being used and the type of temp table (Table variable or traditional table).

A traditional temp table stores the data in the temp DB, which does slow down the temp tables; however table variables do not.

JoshBerke
  • 66,142
  • 25
  • 126
  • 164