2

I've written a UDF for re-use in an ETL cleansing process and am trying to quantify which approach is better. I'm wondering if my assumption is correct that a CTE will compute a value and materialize it when called, but am concerned it may just abstract the computation being run multiple times.

Sample code to consider:

with cte as (select 'This is a test////////$$$$$$$$' as val),
    cteReplaceDollar as (select replace(val, '$', '') as val from cte),
    cteReplaceSlash as (select replace(val, '/', '') as val from cteReplaceDollar),
    cteReplaceEmpty as (select replace(val, ' ', '') as val from cteReplaceSlash)
select 
    * 
from 
    cteReplaceEmpty
where
    val = 'Thisisatest' and
    val > '' and
    isnumeric(val) = 0;

Is this performing the replace 3 times (an optimistic assumption) or 18 times (a pessimistic assumption)? Basically, does it materialize the intermediate steps similar to variable assignment, or does it work as a generator expression in which each step is performed for every reference of the value and every value in the collection?

Solonotix
  • 449
  • 3
  • 15
  • 1
    [This SO question seems related](https://stackoverflow.com/questions/13090037/t-sql-cte-materializing-techniques-not-working-on-sql-server-2012). The general answer appears to be that SQL Server might materialize a CTE, depending on how you write it. – Tim Biegeleisen Mar 06 '19 at 01:45
  • I think the answer is: It depends... But this is not enough to place an answer ;-) You tell the engine **what** you want, but not **how to act**. The engine will find the best execution plan (at least: tries to find). The same query can lead to quite different plans depending on things you cannot know in advance. Sometimes I miss something like `WITH cte FORCE PRECOMPUTE AS(blah)` ... – Shnugo Mar 06 '19 at 09:11
  • 1
    @Shnugo I try to follow that principle of WHAT vs HOW, but in this case I was specifically performance-tuning in the hopes of improving the processing of an existing function, so I had to be concerned with the HOW. – Solonotix Mar 06 '19 at 18:40

1 Answers1

4

Generally, SQL Server doesn't materialise CTE, as opposed to, say, Postgres.

You can confirm it by examining the actual execution plan for your query. I'd recommend SentryOne Plan Explorer, it is a great tool.

https://www.sentryone.com/plan-explorer

I expect to see 7 calls to replace in your example.


Well, I miscalculated. The real answer is:

you should check the actual execution plan.

In your example it looks like this:

Filer

9 calls to replace in Filter operator.

Compute scalar

plus 3 calls in Compute Scalar operator.

12 in total.


So, we confirmed that SQL Server didn't materialise CTE in this example. (It was SQL Server 2017 Developer Edition)

Some further reading:

What's the difference between a CTE and a Temp Table?

Is there a performance difference between CTE , Sub-Query, Temporary Table or Table Variable?

Use of With Clause in SQL Server

There is a suggestion for Microsoft to add a Materialize hint for CTE, similar to what Oracle offers: T-SQL Common Table Expression "Materialize" Option

Vladimir Baranov
  • 31,799
  • 5
  • 53
  • 90
  • 1
    Thanks for all the good reference material. I use SentryOne Plan Explorer a ton for work, and I didn't think to use it to analyze the plan for expressions. I guess I should re-think how I apply these principles, as I used CTE's frequently to compartmentalize code re-use, and hoped it would eagerly evaluate and store in TempDB rather than dynamically present it everytime it was called. Ah well, learn from your mistakes. Thanks again. – Solonotix Mar 06 '19 at 18:36