I've written a UDF for re-use in an ETL cleansing process and am trying to quantify which approach is better. I'm wondering if my assumption is correct that a CTE will compute a value and materialize it when called, but am concerned it may just abstract the computation being run multiple times.
Sample code to consider:
with cte as (select 'This is a test////////$$$$$$$$' as val),
cteReplaceDollar as (select replace(val, '$', '') as val from cte),
cteReplaceSlash as (select replace(val, '/', '') as val from cteReplaceDollar),
cteReplaceEmpty as (select replace(val, ' ', '') as val from cteReplaceSlash)
select
*
from
cteReplaceEmpty
where
val = 'Thisisatest' and
val > '' and
isnumeric(val) = 0;
Is this performing the replace 3 times (an optimistic assumption) or 18 times (a pessimistic assumption)? Basically, does it materialize the intermediate steps similar to variable assignment, or does it work as a generator expression in which each step is performed for every reference of the value and every value in the collection?