We have a specific query that runs a lot slower when executed inside a proc. I should add that it is enclosed within a two-level cursor; however, both cursors produce a result set of one row per iteration.
Let me first list the things we tried that failed:
- Avoiding parameter sniffing by using option (recompile) and option (optimize for (@var UNKNOWN))
- The suggestions in this thread. The variables that seem to be the problem are actually local variables, not proc parameters.
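For concreteness, the hint variants we tried looked roughly like this (a sketch only; the query body is abbreviated here, and the exact placement inside the proc is simplified):

```sql
-- Sketch of the two failed hint attempts; neither changed the slow plan.
-- Attempt 1: force a fresh compile, which also lets the optimizer embed
-- the current values of local variables into the plan.
select /* same select list as the full query below */ ld.tab_id
from loyalty_policy_data ld
inner join #tbl_data t
    on t.insdate >= ld.fdateactive and t.insdate <= ld.tdateactive
where ld.loyalty_policy = @loop2_loyalty_policy
option (recompile);

-- Attempt 2 (separate run): explicitly request the density-based
-- "unknown" estimate for the suspect variables.
-- option (optimize for (@tpdim1 unknown, @tpdim2 unknown));
```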
Here is the query as taken from inside the proc/cursors.
select @tpdim1 = dim1, @tpdim2 = dim2, @typecalc = typecalc
from loyalty_policy where code=@loop2_loyalty_policy
Note: @loop2_loyalty_policy is the variable taken from the result of the inner cursor, and has one value. code is the PK of the loyalty_policy table. Thus, @tpdim1 and @tpdim2 each hold a single value.
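For context, the enclosing structure is roughly the following (a heavily simplified, hypothetical sketch; the real cursor queries are omitted and replaced with placeholders):

```sql
-- Hypothetical outline of the two nested cursors; in our data each
-- cursor yields exactly one row per iteration.
declare loop1 cursor local fast_forward for
    select dbupddate from dbo.outer_source;          -- placeholder query
open loop1;
fetch next from loop1 into @loop1_dbupddate;
while @@fetch_status = 0
begin
    declare loop2 cursor local fast_forward for
        select code from loyalty_policy;             -- placeholder query
    open loop2;
    fetch next from loop2 into @loop2_loyalty_policy;
    while @@fetch_status = 0
    begin
        -- the problematic select + insert runs here
        fetch next from loop2 into @loop2_loyalty_policy;
    end;
    close loop2; deallocate loop2;
    fetch next from loop1 into @loop1_dbupddate;
end;
close loop1; deallocate loop1;
```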
SET STATISTICS PROFILE ON
SET STATISTICS xml on
insert into @tbl_loyal_loop2 (cnt, store, map, pda, insdate, line, item, loyalty_policy_data, loyal_calc, loyalty_policy)
select @cnt, t.store, t.map, t.pda, t.insdate, t.line, t.item, ld.tab_id,
case @typecalc
when 1 then convert(bigint,round(isnull(t.valueFromTran,0.00) * ld.value , 0 ) )
when 2 then convert(bigint,round(isnull(t.netvalue,0.00) * ld.value , 0 ) )
when 3 then convert(bigint,isnull(t.qty,0) * ld.value )
when 4 then convert(bigint,round(isnull(t.valueFromPrice2,0.00) * ld.value , 0 ) )
when 5 then convert(bigint,round(isnull(t.valueFromPrice3,0.00) * ld.value , 0 ) )
when 6 then convert(bigint,round(isnull(t.valueFromPrice4,0.00) * ld.value , 0 ) )
else 0 end
,@loop2_loyalty_policy
from loyalty_policy_data ld -- with (index=ind_loyalty_policy_02)
inner join #tbl_data t on t.insdate >= ld.fdateactive and t.insdate <= ld.tdateactive
where ld.loyalty_policy = @loop2_loyalty_policy
and ld.tdateactive >= @from_rundate and ld.fdateactive <= @to_rundate
and t.dbupddate > @loop1_dbupddate
and
case when @tpdim1 is null then ''
else
case @tpdim1
when 'STORE' then t.store when 'BRAND' then t.brand when 'CAT1' then t.cat1 when 'CAT2' then t.cat2 when 'CAT3' then t.cat3 when 'ITEM' then t.item
when 'CUSTGROUP' then t.custgroup when 'CUSTGROUP2' then t.custgroup2 when 'CUSTGROUP3' then t.custgroup3
when 'CUSTOMER' then @customer
else '' end
end
= case when @tpdim1 is null then '' else ld.dim1 end
and
case when @tpdim2 is null then ''
else
case @tpdim2
when 'STORE' then t.store when 'BRAND' then t.brand when 'CAT1' then t.cat1 when 'CAT2' then t.cat2 when 'CAT3' then t.cat3 when 'ITEM' then t.item
when 'CUSTGROUP' then t.custgroup when 'CUSTGROUP2' then t.custgroup2 when 'CUSTGROUP3' then t.custgroup3
when 'CUSTOMER' then @customer
else '' end
end
= case when @tpdim2 is null then '' else ld.dim2 end
SET STATISTICS xml off
The SET STATISTICS XML output for the above returns this plan.
In trying to debug it, we isolated the query in the following form (here you can also see how table #a is built; it contains exactly the same data as the previous #tbl_data):
drop table #a;
select dt.dbupddate, dt.insdate, dt.map, dt.pda, pt.line, pt.item,
( pt.exp_qty - pt.imp_qty) as qty,
( pt.exp_value + pt.imp_value ) as netvalue,
( (document.exp_val - document.imp_val) * (pt.netvalue - pt.vat_value) ) as valueFromTran,
( (document.exp_val - document.imp_val) * ( ( (pt.qty - pt.qty_gift) * isnull(pt.price2,0.00) ) * (1.00-( pt.disc_perc / 100)) ) ) as valueFromPrice2,
( (document.exp_val - document.imp_val) * ( ( (pt.qty - pt.qty_gift) * isnull(pt.price3,0.00) ) * (1.00-( pt.disc_perc / 100)) ) ) as valueFromPrice3,
( (document.exp_val - document.imp_val) * ( ( (pt.qty - pt.qty_gift) * isnull(pt.price4,0.00) ) * (1.00-( pt.disc_perc / 100)) ) ) as valueFromPrice4,
dt.store, item.brand, item.cat1, item.cat2, item.cat3, customer.custgroup, customer.custgroup2, customer.custgroup3
into #a
from document with (nolock)
inner join dt with (nolock) on dt.doccode = document.code
inner join store with (nolock) on store.code = dt.store and store.calc_loyal = 1
inner join customer with (nolock) on customer.code = dt.customer
inner join pt with (nolock) on dt.map = pt.map and dt.pda=pt.pda
inner join item with (nolock) on item.code = pt.item and item.itemtype in (select code from itemtype with (nolock) where vsales = 1)
where dt.canceled = 0 and document.is_opposite = 0 and document.type = 3 and dt.customer=N'EL4444444'
and dt.insdate >= '20180109' and dt.insdate <= '20190108' ;
SET STATISTICS PROFILE ON
select t.store, t.map, t.pda, t.insdate, t.line, t.item, ld.tab_id,
case 4
when 1 then convert(bigint,round(isnull(t.valueFromTran,0.00) * ld.value , 0 ) )
when 2 then convert(bigint,round(isnull(t.netvalue,0.00) * ld.value , 0 ) )
when 3 then convert(bigint,isnull(t.qty,0) * ld.value )
when 4 then convert(bigint,round(isnull(t.valueFromPrice2,0.00) * ld.value , 0 ) )
when 5 then convert(bigint,round(isnull(t.valueFromPrice3,0.00) * ld.value , 0 ) )
when 6 then convert(bigint,round(isnull(t.valueFromPrice4,0.00) * ld.value , 0 ) )
else 0 end
,'003'
--select count(*)
from loyalty_policy_data ld with (index=ind_loyalty_policy_02)
inner join #a t on t.insdate >= ld.fdateactive and t.insdate <= ld.tdateactive
where ld.loyalty_policy = '003'
--and ld.tdateactive >= '20180109' and ld.fdateactive <= '20190108'
and t.dbupddate > '20000101'
and
case when 'CUSTOMER' is null then ''
else
case 'CUSTOMER'
when 'STORE' then t.store when 'BRAND' then t.brand when 'CAT1' then t.cat1 when 'CAT2' then t.cat2 when 'CAT3' then t.cat3 when 'ITEM' then t.item
when 'CUSTGROUP' then t.custgroup when 'CUSTGROUP2' then t.custgroup2 when 'CUSTGROUP3' then t.custgroup3
when 'CUSTOMER' then 'EL0134366'
else '' end
end
= case when 'CUSTOMER' is null then '' else ld.dim1 end
and
case when 'BRAND' is null then ''
else
case 'BRAND'
when 'STORE' then t.store when 'BRAND' then t.brand when 'CAT1' then t.cat1 when 'CAT2' then t.cat2 when 'CAT3' then t.cat3 when 'ITEM' then t.item
when 'CUSTGROUP' then t.custgroup when 'CUSTGROUP2' then t.custgroup2 when 'CUSTGROUP3' then t.custgroup3
when 'CUSTOMER' then 'EL0134366'
else '' end
end
= case when 'BRAND' is null then '' else ld.dim2 end
SET STATISTICS PROFILE off
And here is the execution plan. This runs a LOT faster.
Why this humongous difference? From my limited knowledge of execution-plan analysis, I have noticed:
- In the first (slow) query, the index spool operator has an estimated row count of ~9,700 but an actual row count of 3 million.
- The second query utilizes many operators with parallelism.
- The only "real" difference I can see in the second query is the hand-substituted values in place of @tpdim1 and @tpdim2. Sure enough, when we went into the first query's proc code and replaced @tpdim1 and @tpdim2 with the single values they should get, it ran as fast as the second query.
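Concretely, with the values from this run (@tpdim1 = 'CUSTOMER', @tpdim2 = 'BRAND'), the two CASE predicates collapse after hand substitution to plain column comparisons, roughly:

```sql
-- What the hand-substituted (fast) variant effectively reduces to;
-- the optimizer now sees direct comparisons instead of CASE expressions.
and ld.dim1 = 'EL0134366'   -- the @tpdim1 = 'CUSTOMER' branch
and ld.dim2 = t.brand       -- the @tpdim2 = 'BRAND' branch
```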
Could you please explain this difference and propose some advice to fix the procedure?
Edit: As Laughing Vergil recommended, I replaced the literals in the second query with previously declared variables, and it ran slow again!
Edit 2: I have some additional info from doing some further research.
First, I have isolated the problem to this line:
case when @tpdim1 is null then ''    <-- this uses the slow plan
versus:
case when 'CUSTOMER' is null then ''    <-- this uses the fast plan
This is true in the ad-hoc query; no need to trouble ourselves with procs and/or cursors. It keeps happening even if I change the code to the recommended dynamic where structure.
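For reference, the "dynamic where" rewrite that was tried replaces the nested CASE with OR'ed branches, roughly like this (a sketch of the pattern for dim1 only, not the exact production code):

```sql
-- Sketch of the "dynamic search conditions" shape for the dim1 predicate;
-- the dim2 predicate is analogous. This still produced the slow plan.
and ( @tpdim1 is null
      or (@tpdim1 = 'STORE'    and ld.dim1 = t.store)
      or (@tpdim1 = 'BRAND'    and ld.dim1 = t.brand)
      or (@tpdim1 = 'ITEM'     and ld.dim1 = t.item)
      or (@tpdim1 = 'CUSTOMER' and ld.dim1 = @customer) )
```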
I haven't created any sample data yet, but the important information (as can be seen in the plans) is that loyalty_policy_data has about 720k rows if we filter only by loyalty_policy = @loop2_loyalty_policy. However, if we also evaluate the @tpdim1 condition, which essentially is dim1 = N'EL0134366', only 4 rows are returned.
The difference between the plans, then, is when this condition is evaluated relative to the date-checking conditions.
In the fast plan, it gets evaluated first: while seeking the index for the loyalty_policy value, it is applied as a (non-seek) predicate. Although this predicate is not covered by the index, only 4 rows come out, and all the other operators have "logical" sizes.
In contrast, the slow plan painfully postpones this predicate until far too late. If I've figured it correctly, it performs a nested loops join with loyalty_policy_data as the outer table (which is crazy), passing the needed columns in as outer references. For each such row, the index spool scans the #table (~1k rows) and finds about 250 matches, which it passes to the filter that finally does the @tpdim1 filtering. Thus, 250 * 700k rows are passed to the filter operator.
So now I think I know what happens. But I can't figure why.